Abstract

Logical characterizations of the common prior assumption (CPA) are investigated.
Two approaches are considered. The first is called frame distinguishability,
and  is  similar  in  spirit  to  the  approaches  considered  in  the  economics  literature. 
Results  similar  to  those  obtained  in  the  economics  literature  are  proved  here  as 
well,  namely,  that  we  can  distinguish  finite  spaces  that  satisfy  the  CPA  from  those 
that  do  not  in  terms  of  disagreements  in  expectation.  However,  it  is  shown  that, 
for  the  language  used  here,  no  formulas  can  distinguish  infinite  spaces  satisfying 
the  CPA  from  those  that  do  not.  The  second  approach  considered  is  that  of  finding 
a  sound  and  complete  axiomatization.  Such  an  axiomatization  is  provided;  again, 
the  key  axiom  involves  disagreements  in  expectation.  The  same  axiom  system  is 
shown  to  be  sound  and  complete  both  in  the  finite  and  the  infinite  case.  Thus,  the 
two  approaches  to  characterizing  the  CPA  behave  quite  differently  in  the  case  of 
infinite  spaces. 


*Work  supported  in  part  by  NSF  under  grant  IRI-96-25901  and  by  the  Air  Force  Office  of  Scientific 
Research  under  grant  F49620-96-1-0323. 


1  Introduction 


The  common  prior  assumption  (CPA)  is  one  that,  up  until  quite  recently,  was  almost 
an  article  of  faith  among  economists.  This  assumption  says  that  differences  in  beliefs 
among  agents  can  be  completely  explained  by  differences  in  information.  Essentially,  the 
picture  is  that  agents  start  out  with  identical  prior  beliefs  (the  common  prior)  and  then 
condition  on  the  information  that  they  later  receive.  If  their  later  beliefs  differ,  it  must 
thus  be  due  to  the  fact  that  they  have  received  different  information. 

The  CPA  has  played  a  prominent  role  in  economic  theory.  Harsanyi  [1968]  showed 
that  a  game  of  incomplete  information  could  be  reduced  to  a  standard  game  of  imperfect 
information  with  an  initial  move  by  nature  iff  individuals  could  be  viewed  as  having  a 
common  prior  over  some  state  space.  Aumann  [1976]  showed  that  individuals  with  a 
common prior could not “agree to disagree”; that is, if their posteriors were derived from
a  common  prior  and  they  had  common  knowledge  of  their  posterior  probabilities  of  a 
particular  event,  these  posteriors  would  have  to  be  the  same. 

The  CPA  has  come  under  a  great  deal  of  scrutiny  recently.  (See,  for  example,  the 
exchange  between  Gul  [1998]  and  Aumann  [1998];  see  [Morris  1995]  for  an  overview.) 
In  an  effort  to  try  to  understand  the  implications  of  the  CPA  better,  there  have  been 
a  number  of  attempts  to  characterize  the  CPA.  Of  most  relevance  here  are  the  results 
of  Bonanno  and  Nehring  [1999],  Feinberg  [1995,  2000],  Morris  [1994],  and  Samet  [1998], 
who  all  showed  that,  in  finite  spaces,  the  common  prior  could  be  characterized  by  a 
disagreement  in  expectations,  in  a  sense  explained  below.  Feinberg  [2000]  extended  this 
result  to  infinite  spaces  that  satisfied  a  certain  compactness  condition,  and  also  showed 
that  this  compactness  condition  was  necessary. 

This  paper  continues  these  efforts.  I  characterize  the  CPA  using  traditional  tools  from 
modal logic, and compare these characterizations to those used in the economics literature.
In the process, I highlight the role of the language used in getting a characterization.
Feinberg [2000] showed how to characterize the CPA in syntactic terms, essentially using
a logic with operators for knowledge and probability. I use a much richer language here,
one introduced in [Fagin and Halpern 1994], which has operators for knowledge, common
knowledge, and probability. Feinberg’s language is weaker than that used here in two
significant respects. The first is that it does not include an operator for common knowledge.
To  get  around  this,  his  characterization  involves  infinite  sets  of  formulas.  The  second  is 
that  the  operators  in  his  language  do  not  allow  us  to  express  expectation.  In  particular, 
this  means  that  disagreement  in  expectation  cannot  be  expressed.  Feinberg  gets  around 
this  by  an  ingenious  construction  that  involves  adding  coin  tosses  to  the  description  of 
the  world,  in  order  to  construct  a  more  complex  model.  In  this  model,  disagreement  in 
expectation  is  converted  to  disagreement  between  two  agents  about  the  probability  of  an 
event,  and  this  fact  can  be  expressed  in  Feinberg’s  language.  By  using  a  richer  language, 
the  need  for  this  construction  is  completely  obviated. 

However,  characterizing  the  CPA  involves  more  than  just  language.  It  depends  on 
what  counts  as  a  characterization.  I  consider  two  quite  different  characterizations  here. 
One is called frame distinguishability, and is very similar in spirit to the types of
characterization considered in the economics literature. Not surprisingly, the results I obtain
for frame distinguishability are quite similar to those obtained in the economics literature
(and much the same techniques are used). In particular, I show that finite frames
(essentially, finite spaces) that satisfy the CPA can be distinguished from those that do not in
terms  of  disagreements  in  expectation.  However,  there  are  no  formulas  in  the  language 
considered  here  that  can  distinguish  infinite  spaces  satisfying  the  CPA  from  those  that 
do  not. 

The  second  type  of  characterization  I  consider  is  that  of  finding  a  sound  and  complete 
axiomatization. I provide such an axiomatization; again, the key axiom involves
disagreements in expectation. The same axiom system is shown to be sound and complete both
in  the  finite  and  the  infinite  case.  Thus,  the  distinction  between  finite  and  infinite  spaces 
vanishes  when  we  consider  axiomatizations.  Roughly  speaking,  this  can  be  understood 
as  saying  that  the  language  is  too  weak  to  distinguish  finite  from  infinite  spaces  (despite 
being  much  stronger  than  that  considered  by  Feinberg). 

It  may  seem  strange  at  first  that  a  language  not  rich  enough  to  provide  a  distinguishing 
test  can  still  completely  characterize  all  the  properties  of  a  notion  of  interest  in  that 
language.  But  this  phenomenon  is  actually  quite  familiar  in  other  areas  of  mathematics. 
There  is  a  well-known  complete  axiomatization  of  the  real  numbers  with  addition  and 
multiplication  due  to  Tarski  [1951].  Nevertheless  there  are  nonstandard  models  of  the 
reals  that  satisfy  the  same  axioms,  so  the  language  cannot  distinguish  the  standard 
models  from  the  nonstandard  models.  (This  observation  is  in  fact  the  basis  for  the  whole 
enterprise  of  nonstandard  analysis  [Davis  1977].)  Essentially,  I  show  that,  just  as  there 
are  nonstandard  models  of  the  reals  that  satisfy  all  the  properties  of  the  reals,  there 
are  “nonstandard”  models  that  satisfy  all  the  properties  of  the  CPA  expressible  in  the 
(rather  rich)  language  considered  here  yet  do  not  satisfy  the  CPA. 

A  natural  question  to  ask  is  which  of  the  two  types  of  characterization  I  consider  is 
more  appropriate.  That,  of  course,  depends  on  the  application.  If  we  are  interested  in 
testing  whether  the  CPA  holds  in  a  given  space,  this  is  a  question  essentially  about  frame 
distinguishability.  As  it  happens,  if  a  finite  space  does  not  satisfy  the  CPA,  there  is  a 
single  formula  that  will  be  true  in  that  space  that  is  not  true  in  any  space  that  satisfies 
the  CPA.  Moreover,  that  formula  is  one  that  the  agents  themselves  know  to  be  true,  so 
not  only  can  the  modeler  make  the  distinction,  the  agents  themselves  can.1  On  the  other 
hand,  suppose  rather  than  being  given  a  particular  space,  all  that  the  modeler  is  given 
is  a  finite  collection  E  of  facts  about  the  space.  (For  example,  E  may  give  information 
about  the  agents’  knowledge  and  beliefs.)  Note  that  E  will  in  general  not  determine  a 
single  space;  there  may  be  a  number  of  spaces  compatible  with  E.  The  modeler  may  then 
be  interested  in  knowing  what  follows  from  the  CPA  together  with  E  as  opposed  to  just 
from  E  alone.  That  is,  what  extra  conclusions  follow  from  the  CPA,  given  E.  This  is  a 
question that can be answered using a complete axiomatization; frame distinguishability
is of no help at all. Thus, for example, the axiomatization can be viewed as providing,
among other things, an exact characterization of the extent to which the CPA implies no
common knowledge of disagreement.

1There are some subtle computational issues here though; see Section 3.1.

The  rest  of  this  paper  is  organized  as  follows.  In  Section  2,  I  carefully  define  the 
language considered and its semantics. In Section 3, I consider the two types of
characterizations. I also consider what happens if the common knowledge operator is not in the
language. In this case, I show that there are no new consequences of the CPA. This result
is similar in spirit to, but different from, one of Lipman [1997]. Lipman showed that
there  are  some  (albeit  weak)  consequences  of  the  CPA,  even  without  common  knowledge 
in  the  language.  The  differences  in  our  results  are  attributable  to  a  small  but  significant 
difference  in  our  definitions  of  the  CPA  in  the  case  when  there  are  information  sets  with 
prior  probability  0;  see  Section  3  for  details.  I  conclude  in  Section  4  with  some  discussion 
of  these  results. 

Most  proofs  can  be  found  in  the  appendix. 


2  Syntax  and  Semantics 

To reason formally about knowledge and probability, the standard approach in the literature
in philosophy and mathematics, which has also been adopted in computer science,
starts  with  a  language  (the  syntax).  Of  course,  there  is  some  flexibility  in  exactly  what 
language  should  be  chosen.  Since  I  want  to  reason  about  knowledge,  common  knowledge, 
and  probability  here,  I  use  the  syntax  first  defined  in  [Fagin  and  Halpern  1994],  that  lets 
us  reason  explicitly  about  all  these  notions.  This  choice  of  language  (particularly  the 
assumption  that  the  language  includes  common  knowledge)  has  nontrivial  consequences 
for the results of this paper, as we shall see. I return to the issue of the choice of language
in Section 4; for now I just focus on this language (occasionally without common
knowledge). 

Suppose we consider a system with $n$ agents, say $1, \ldots, n$, and we have a nonempty
set $\Phi$ of primitive propositions about which we wish to reason. (Think of these primitive
propositions as representing basic events such as “agent 1 went left on his last move”.) We
take $\mathcal{L}_n^{K,C,pr}$ to be the least set of formulas that includes $\Phi$ and is closed under the following
construction rules:2 If $\varphi, \varphi', \varphi_1, \ldots, \varphi_m$ are formulas in $\mathcal{L}_n^{K,C,pr}$, then so are $\neg\varphi$, $\varphi \land \varphi'$,
$K_i\varphi$, $i = 1, \ldots, n$ (which is read “agent $i$ knows $\varphi$”), $C\varphi$ (“$\varphi$ is common knowledge”),
and $a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b$, where $a_1, \ldots, a_m, b$ are rational numbers ($pr_i(\varphi)$
is read “the probability of $\varphi$ according to agent $i$”). Let $\mathcal{L}_n^{K,pr}$ consist of all the formulas
in $\mathcal{L}_n^{K,C,pr}$ that do not mention the $C$ operator.

As usual, $\varphi \lor \varphi'$ and $\varphi \Rightarrow \varphi'$ are abbreviations for $\neg(\neg\varphi \land \neg\varphi')$ and $\neg\varphi \lor \varphi'$, respectively.
In addition, $E^1\varphi$ (“everyone knows $\varphi$”) is an abbreviation for $K_1\varphi \land \ldots \land K_n\varphi$ and $E^{m+1}\varphi$
is an abbreviation for $E^1 E^m \varphi$ (“everyone knows that everyone knows ... that everyone
knows $\varphi$”, with “everyone knows” repeated $m + 1$ times), for $m \ge 1$. Many other abbreviations
will be used for reasoning about probability without further comment, such as $pr_i(\varphi) < b$ for $\neg(pr_i(\varphi) \ge b)$,
$pr_i(\varphi) \le b$ for $-pr_i(\varphi) \ge -b$, and $pr_i(\varphi) = b$ for $pr_i(\varphi) \le b \land pr_i(\varphi) \ge b$. Note that
we can express simple conditional probabilities such as $pr_i(\varphi \mid \psi) = 2/3$ by clearing the
denominator to get $pr_i(\varphi \land \psi) = \frac{2}{3} pr_i(\psi)$.

2Strictly speaking, I should write $\mathcal{L}_n^{K,C,pr}(\Phi)$, because $\Phi$ is also a parameter of the language, just as
$n$ is. However, I omit it here, to simplify the notation.

The operators $K_i$ and $C$ allow us to reason about knowledge and common knowledge,
respectively. Formulas such as $a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b$ are called i-probability
formulas; they allow us to express a number of notions of interest.3 Note that by using
i-probability formulas, we can also describe agent $i$’s beliefs about the expected value
of a random variable, provided that the worlds in which the random variable takes on
a particular value can be characterized by formulas. For example, suppose that agent 1
wins \$2 if a coin lands heads and loses \$3 if it lands tails. Then the formula $2pr_1(heads) - 3pr_1(tails) \ge 1$
says that agent 1 believes his expected winnings are at least \$1. This is
a much richer language for expressing an agent’s beliefs than that used in the relevant
literature in economics (for example, [Feinberg 2000]), although the belief indices of
Bonanno and Nehring [1999] provide a semantic way of expressing yet richer notions.
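As a quick numerical check of the coin example, the following sketch evaluates a linear probability inequality for given probabilities. The helper name `holds` and the two probability assignments are assumptions made purely for illustration; they do not come from the paper.

```python
from fractions import Fraction

def holds(coeffs, probs, b):
    """Truth of a_1*pr(phi_1) + ... + a_m*pr(phi_m) >= b for the given probabilities."""
    return sum(a * p for a, p in zip(coeffs, probs)) >= b

# Payoffs: +2 on heads, -3 on tails; the formula is 2*pr(heads) - 3*pr(tails) >= 1.
fair = [Fraction(1, 2), Fraction(1, 2)]    # assumed beliefs: fair coin
biased = [Fraction(4, 5), Fraction(1, 5)]  # assumed beliefs: heads with probability 4/5

fair_ok = holds([2, -3], fair, 1)      # expected winnings are -1/2
biased_ok = holds([2, -3], biased, 1)  # expected winnings are exactly 1
```

Since the expected winnings equal 5 pr(heads) - 3, the formula holds exactly when the agent assigns heads probability at least 4/5.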

To assign truth values to formulas in $\mathcal{L}_n^{K,C,pr}$, we need a semantic model. The basic
semantic model we use is a (Kripke) frame (for knowledge and probability for $n$ agents).
This is a tuple $F = (W, \mathcal{K}_1, \ldots, \mathcal{K}_n, \mathcal{PR}_1, \ldots, \mathcal{PR}_n)$, where $W$ is a set of possible worlds
or states, $\mathcal{K}_1, \ldots, \mathcal{K}_n$ are equivalence relations on $W$, and $\mathcal{PR}_1, \ldots, \mathcal{PR}_n$ are probability
assignments; $\mathcal{PR}_i$ associates with each world $w$ in $W$ a probability space $\mathcal{PR}_i(w) =
(W_{w,i}, \mathcal{X}_{w,i}, \Pr_{w,i})$. Intuitively, $\mathcal{K}_i(w) = \{w' : (w, w') \in \mathcal{K}_i\}$ is the set of worlds that agent
$i$ considers possible in world $w$ and $\mathcal{PR}_i(w)$ is the probability space that agent $i$ uses at
world $w$. $\mathcal{PR}_i$ must satisfy the following three assumptions.

A1. $W_{w,i} = \mathcal{K}_i(w)$: that is, the sample space at world $w$ consists of the worlds that
agent $i$ considers possible at $w$.

A2. If $w' \in \mathcal{K}_i(w)$, then $\mathcal{PR}_i(w) = \mathcal{PR}_i(w')$: at all worlds that agent $i$ considers
possible, he uses the same probability space.

A3. $\mathcal{X}_{w,i}$, the set of measurable sets, includes $\mathcal{K}_i(w) \cap \mathcal{K}_j(w')$ for each agent $j$ and world
$w' \in \mathcal{K}_i(w)$. Intuitively, each agent’s information partitions are measurable.

Apart from minor notational differences, a Kripke frame is the standard model used in the
economics literature to capture knowledge and probability (see, for example, [Feinberg
2000]); $\mathcal{K}_i(w)$ is usually called agent $i$’s information set at world $w$. In the economics
literature, an agent’s knowledge is usually characterized by a partition, but this, of course,
is equivalent to using an equivalence relation.4 I sometimes describe a relation $\mathcal{K}_i$ by
describing the partition it induces.

3Note that the syntax does not allow “mixed” formulas such as $pr_1(\varphi_1) + pr_2(\varphi_2) \ge 1$. There would
be no difficulty giving semantics to such formulas, but the results on complete axiomatizations become
more difficult if we allow them. Thus, for ease of exposition, they are disallowed here, just as they are
in [Fagin and Halpern 1994].

A frame does not tell us how to connect the language to the worlds. For example, it
does not tell us under what circumstances a primitive proposition $p$ is true. To do that, we
need an interpretation, that is, a function that associates with each primitive proposition
an event, namely, the set of worlds where it is true. The traditional way of capturing this
in the logic community is by taking $\pi$ to be a function that associates with each world $w$
a truth assignment to the primitive propositions in $\Phi$; i.e., $\pi(w)(p) \in \{true, false\}$ for
each primitive proposition $p \in \Phi$ and each world $w \in W$. A (Kripke) structure (for
knowledge and probability for $n$ agents) is a tuple $M = (W, \mathcal{K}_1, \ldots, \mathcal{K}_n, \mathcal{PR}_1, \ldots, \mathcal{PR}_n, \pi)$,
where $F = (W, \mathcal{K}_1, \ldots, \mathcal{K}_n, \mathcal{PR}_1, \ldots, \mathcal{PR}_n)$ is a frame and $\pi$ is an interpretation, with
the restriction that

A4. $\mathcal{K}_i(w) \cap [\![p]\!]_M \in \mathcal{X}_{w,i}$ for each primitive proposition $p \in \Phi$, where $[\![p]\!]_M = \{w :
\pi(w)(p) = true\}$ is the event that $p$ is true in structure $M$. Intuitively, this makes
$[\![p]\!]_M$ a measurable event at every world.

We  say  that  the  structure  M  is  based  on  the  frame  F.  Note  that  there  are  many  structures 
based  on  a  frame  F,  one  for  each  choice  of  interpretation. 

Kripke structures for knowledge and probability were first considered in [Fagin and
Halpern 1994], but A1-A4 were not required in the basic framework. These four requirements
correspond to the requirements denoted CONS (for consistency), SDP (for
state-determined probability), and MEAS (for measurability) in [Fagin and Halpern 1994].

We can now associate an event with each formula in $\mathcal{L}_n^{K,C,pr}$ in a Kripke structure. We
write $(M, w) \models \varphi$ if the formula $\varphi$ is true at world $w$ in Kripke structure $M$; generalizing
the earlier notation, we denote by $[\![\varphi]\!]_M = \{w : (M, w) \models \varphi\}$ the event that $\varphi$ is true
in structure $M$. We proceed by induction on the structure of $\varphi$, assuming that we have
given the definition for all subformulas $\varphi'$ of $\varphi$ and that $[\![\varphi']\!]_M \cap \mathcal{K}_i(w) \in \mathcal{X}_{w,i}$; that is,
the event corresponding to each formula must be measurable.

$(M, w) \models p$ (for $p \in \Phi$) iff $\pi(w)(p) = true$

$(M, w) \models \varphi \land \varphi'$ iff $(M, w) \models \varphi$ and $(M, w) \models \varphi'$

$(M, w) \models \neg\varphi$ iff $(M, w) \not\models \varphi$

$(M, w) \models K_i\varphi$ iff $(M, w') \models \varphi$ for all $w' \in \mathcal{K}_i(w)$

$(M, w) \models C\varphi$ iff $(M, w) \models E^k\varphi$ for all $k \ge 1$

$(M, w) \models a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b$
iff $a_1 \Pr_{w,i}([\![\varphi_1]\!]_M \cap \mathcal{K}_i(w)) + \cdots + a_m \Pr_{w,i}([\![\varphi_m]\!]_M \cap \mathcal{K}_i(w)) \ge b$.

4Bonanno and Nehring [1999] assume only that the relation is serial, Euclidean, and transitive, which
is a weaker assumption than it being an equivalence relation, because they want to model belief rather
than knowledge. Otherwise, their formalism is the same.


The clause for $K_i\varphi$ captures the intuition that $K_i\varphi$ is true at world $w$ if $\varphi$ is true at all
the worlds the agent considers possible at $w$, namely $\mathcal{K}_i(w)$; the clause for $C\varphi$ enforces
the intuition that common knowledge is equivalent to everyone knows, and everyone
knows that everyone knows, .... Finally, the clause for i-probability formulas captures
the intuition that a formula such as $pr_i(\varphi) + 2pr_i(\psi) \ge 1$ just says that, according to
agent $i$, the probability of $\varphi$ plus twice the probability of $\psi$ is at least 1.

It should be clear that this approach of starting with formulas and associating events
with them is not so far removed from the more standard approach in the economics
literature of defining knowledge in terms of operators $\mathsf{K}_1, \ldots, \mathsf{K}_n$ on events, where
$\mathsf{K}_i(E) = \{w : \mathcal{K}_i(w) \subseteq E\}$. In particular, it is easy to see that $\mathsf{K}_i([\![\varphi]\!]_M) = [\![K_i\varphi]\!]_M$.
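The event-based knowledge operator can be sketched in a few lines for a finite partition. The function name `knows`, the four-world partition, and the event below are hypothetical choices made for illustration only.

```python
def knows(cells, event):
    """K_i(E): the worlds whose information set K_i(w) is contained in E."""
    return {w for cell in cells for w in cell if cell <= event}

# Hypothetical partition for one agent over four worlds.
cells_1 = [{"w1", "w2"}, {"w3", "w4"}]
event = {"w1", "w2", "w3"}

known = knows(cells_1, event)  # only the cell {w1, w2} lies inside the event
```

Applying `knows` twice returns the same set as applying it once, which reflects the fact that, with partitions, an agent who knows an event also knows that he knows it.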

For future reference, it is useful to recall a well-known alternative characterization
of common knowledge. We say that world $w'$ is reachable from $w$ if there exist worlds
$w_0, \ldots, w_m$ such that $w = w_0$, $w' = w_m$, and for all $k < m$, there exists an agent $j$ such
that $w_{k+1} \in \mathcal{K}_j(w_k)$. Let $C(w)$ consist of all the worlds reachable from $w$; $C(w)$ is called
the component of $w$. The reachability relation is clearly an equivalence relation; thus, $C$
partitions the set $W$ of worlds into components. A subset $W' \subseteq W$ is a component of $W$
if $W' = C(w)$ for some $w \in W$.

The  following  lemma  is  well  known  (cf.  [Fagin,  Halpern,  Moses,  and  Vardi  1995, 
Lemma  2.2.1]). 

Lemma 2.1: $(M, w) \models C\varphi$ iff $(M, w') \models \varphi$ for all $w' \in C(w)$.
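For a finite frame, the reachability characterization in Lemma 2.1 suggests a direct way to compute components and the worlds at which an event is common knowledge. The sketch below is one possible implementation under that assumption; the function names and the two small frames are hypothetical and chosen only for illustration.

```python
def components(worlds, partitions):
    """Split the worlds into components, i.e. classes of the reachability relation."""
    remaining, result = set(worlds), []
    while remaining:
        frontier = [remaining.pop()]
        comp = set(frontier)
        while frontier:
            w = frontier.pop()
            for cells in partitions.values():
                for cell in cells:
                    if w in cell:
                        new = cell - comp      # worlds reached for the first time
                        comp |= new
                        frontier.extend(new)
        remaining -= comp
        result.append(comp)
    return result

def common_knowledge(worlds, partitions, event):
    """Worlds where the event is common knowledge: the components contained in it."""
    inside = [c for c in components(worlds, partitions) if c <= event]
    return set().union(*inside)

# Two agents with the same partition: two components.
same = {1: [{"w1", "w2"}, {"w3"}], 2: [{"w1", "w2"}, {"w3"}]}
# Two agents with crossing partitions: a single component.
crossing = {1: [{"w1", "w2"}, {"w3", "w4"}], 2: [{"w1", "w3"}, {"w2", "w4"}]}
```

In the second frame every world is reachable from every other, so only the event consisting of all four worlds can be common knowledge anywhere.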

With  this  background,  we  can  formalize  the  CPA.  It  is  simply  another  constraint  on 
probability  assignments. 

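CP. There exists a probability space $(W, \mathcal{X}_W, \Pr_W)$ such that $\Pr_W(W') > 0$ for all
components $W'$ of $W$ and for all $i$, $w$, if $\mathcal{PR}_i(w) = (\mathcal{K}_i(w), \mathcal{X}_{w,i}, \Pr_{w,i})$, then $\mathcal{X}_{w,i} \subseteq
\mathcal{X}_W$ and, if $\Pr_W(\mathcal{K}_i(w)) > 0$, then $\Pr_{w,i}(U) = \Pr_W(U \mid \mathcal{K}_i(w))$ for all $U \in \mathcal{X}_{w,i}$.
(There are no constraints on $\Pr_{w,i}$ if $\Pr_W(\mathcal{K}_i(w)) = 0$.)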
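For a finite frame in which every set of worlds is measurable, condition CP can be checked against a candidate prior world by world. The following is a minimal sketch under that assumption; the function name, the data layout (a posterior probability per agent and world, and an explicitly supplied list of components), and the example frame are all hypothetical.

```python
from fractions import Fraction as F

def is_common_prior(prior, partitions, post, comps):
    """Check CP for a candidate prior on a finite frame (all sets measurable)."""
    if any(sum(prior[w] for w in comp) <= 0 for comp in comps):
        return False  # every component must get positive prior probability
    for i, cells in partitions.items():
        for cell in cells:
            mass = sum(prior[w] for w in cell)
            if mass == 0:
                continue  # no constraint on a cell of prior probability 0
            if any(post[i][w] != prior[w] / mass for w in cell):
                return False  # posterior is not the prior conditioned on the cell
    return True

worlds = ["w1", "w2", "w3", "w4"]
partitions = {1: [{"w1", "w2"}, {"w3", "w4"}], 2: [{"w1", "w3"}, {"w2", "w4"}]}
post = {i: {w: F(1, 2) for w in worlds} for i in (1, 2)}  # both agents use 1/2 everywhere
comps = [set(worlds)]                                      # the frame has one component

uniform = {w: F(1, 4) for w in worlds}
lopsided = {"w1": F(1, 2), "w2": F(1, 2), "w3": F(0), "w4": F(0)}
uniform_ok = is_common_prior(uniform, partitions, post, comps)
lopsided_ok = is_common_prior(lopsided, partitions, post, comps)
```

The uniform prior passes, while the second candidate fails because conditioning it on agent 2's cell {w1, w3} would give w1 probability 1 rather than 1/2. Checking single worlds suffices here because the frame is finite.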

This formalization of the CPA is slightly different from the others in the literature.
Bonanno and Nehring [1999], Feinberg [2000], and Samet [1998] do not require the
condition that the prior gives each component positive probability. However, this condition
is necessary for Aumann’s theorem to hold; see Example 2.3. Aumann [1976, 1987] starts
with the prior and assumes that the posteriors are obtained from the prior by conditioning
on the information of the agents; in our language this means that $\Pr_{w,i}$ is obtained
from $\Pr_W$ by conditioning on $\mathcal{K}_i(w)$. In [Aumann 1976], Aumann explicitly assumes
that $\Pr_W(\mathcal{K}_i(w)) \ne 0$ for all agents $i$ and worlds $w$. (This assumption is also implicitly
made in [Aumann 1987].) While the issue of what happens when the prior gives an
information set zero probability is a relatively minor technical nuisance, it turns out to
play an important role when considering the impact of the CPA. As mentioned in the
Introduction, Lipman [1997] shows that there are still some consequences of the CPA
even without common knowledge in the language. However, as shown here, the assumption
that $\Pr_W(\mathcal{K}_i(w)) \ne 0$ for all $i$, $w$ is crucial for Lipman’s results. With the weaker
assumption that only components need get positive probability, there are in fact no
consequences of the CPA without common knowledge in the language. This is discussed in
more detail in Section 3.

The  CPA  is  far  from  a  weak  assumption,  as  the  following  example  shows. 

Example 2.2: Consider the frame $F_1$ described in Figure 1. There are four worlds;
the partition induced by $\mathcal{K}_1$ has the equivalence classes $\{w_1, w_2\}$ and $\{w_3, w_4\}$, and the
partition induced by $\mathcal{K}_2$ has the equivalence classes $\{w_1, w_3\}$ and $\{w_2, w_4\}$. Whatever
two worlds agent 1 considers possible, he ascribes them both probability 1/2. Agent 2,
however, thinks that $w_3$ is twice as likely as $w_1$ and $w_2$ is twice as likely as $w_4$. It is easy to
see that $F_1$ cannot satisfy CP. ∎

[Figure 1: A frame that does not satisfy CP. Agent 1 assigns probability 1/2 to each of $w_1$, $w_2$, $w_3$, $w_4$; agent 2 assigns probabilities 1/3, 2/3, 2/3, 1/3 to $w_1$, $w_2$, $w_3$, $w_4$, respectively.]
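One way to see that $F_1$ cannot satisfy CP is to note that, when all posteriors are positive, a common prior must reproduce each agent's posterior ratios inside every information set, so the prior weights are forced along any path of overlapping sets. The sketch below propagates these forced ratios and reports a conflict; the function name and data layout are assumptions for illustration, and the argument relies on every posterior being strictly positive.

```python
from collections import deque
from fractions import Fraction as F

def find_prior_weights(start, partitions, post):
    """Propagate the ratios a common prior would have to satisfy.

    Returns relative weights for the component of `start`, or None if two
    paths force different weights. Assumes all posteriors are positive.
    """
    weight = {start: F(1)}
    queue = deque([start])
    while queue:
        w = queue.popleft()
        for i, cells in partitions.items():
            cell = next(c for c in cells if w in c)
            for v in cell:
                forced = weight[w] * post[i][v] / post[i][w]
                if v not in weight:
                    weight[v] = forced
                    queue.append(v)
                elif weight[v] != forced:
                    return None  # inconsistent ratios: no common prior
    return weight

worlds = ["w1", "w2", "w3", "w4"]
partitions = {1: [{"w1", "w2"}, {"w3", "w4"}], 2: [{"w1", "w3"}, {"w2", "w4"}]}
half = {w: F(1, 2) for w in worlds}
post_f1 = {1: half, 2: {"w1": F(1, 3), "w3": F(2, 3), "w2": F(2, 3), "w4": F(1, 3)}}
post_sym = {1: half, 2: half}  # hypothetical variant in which agent 2 also uses 1/2

f1_weights = find_prior_weights("w1", partitions, post_f1)
sym_weights = find_prior_weights("w1", partitions, post_sym)
```

For $F_1$ the weight of $w_4$ is forced to be 1/2 via $w_2$ but 2 via $w_3$ (taking $w_1$ to have weight 1), so no prior exists. In the symmetric variant all weights come out equal, which corresponds to the uniform prior.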

Let $\mathcal{F}_n$ consist of all frames for $n$ agents. Let $\mathcal{F}_n^{fin}$ consist of all frames for $n$ agents
where the set of worlds is finite and the probability spaces at each point are such that
every set is measurable. Let $\mathcal{F}_n^{CP}$ (resp., $\mathcal{F}_n^{CP,fin}$) consist of all frames in $\mathcal{F}_n$ (resp., $\mathcal{F}_n^{fin}$)
that satisfy CP. I use $\mathcal{M}_n$, $\mathcal{M}_n^{fin}$, $\mathcal{M}_n^{CP}$, and $\mathcal{M}_n^{CP,fin}$ to denote the corresponding sets of
structures.5

A formula $\varphi$ is valid (resp., satisfied) in a Kripke structure $M = (W, \ldots)$ if for all
(resp., some) $w \in W$, we have $(M, w) \models \varphi$. A formula is valid (resp., satisfied) in frame
$F$ if it is valid in every Kripke structure (resp., satisfied in some Kripke structure) based
on $F$. A formula $\varphi$ is valid in a set $\mathcal{M}$ of structures (resp., set $\mathcal{F}$ of frames) if it is valid
in every structure $M \in \mathcal{M}$ (resp., every frame $F \in \mathcal{F}$). It is easy to check that a formula
is valid in a set $\mathcal{F}$ of frames iff it is valid in the set $\mathcal{M}$ of all structures based on the
frames in $\mathcal{F}$.

5Technically, these are not sets but classes; they are too large to be sets. I ignore the distinction here.
To the extent that there has been consideration of formulas and structures that satisfy
them  in  the  economics  literature,  the  focus  has  been  on  what  has  been  called  the  canonical 
structure  or  canonical  model.  This  is  essentially  a  universal  structure,  which  has  the 
property  that  if  a  formula  is  satisfiable  at  all,  it  is  satisfied  at  some  world  in  the  canonical 
structure.  This  was  introduced  in  the  economics  literature  by  Aumann  [1989],  although 
the  basic  idea  is  well  known  in  the  modal  logic  community,  and  seems  to  have  been 
introduced  independently  by  Kaplan  [1966],  Makinson  [1966],  and  Lemmon  and/or  Scott 
[Lemmon  1977].  The  canonical  model  has  the  property  that  every  structure  can  be 
embedded  in  it,  in  a  precise  sense. 

This  may  suggest  that  all  we  need  to  consider  is  the  canonical  model.  While  a  case 
for  this  can  be  made  if  we  do  not  have  common  knowledge  in  the  language,  the  canonical 
model  construction  fails  if  we  add  common  knowledge  to  the  language,  because  of  the 
infinitary nature of common knowledge (see [Fagin, Halpern, Moses, and Vardi 1995,
Section  3.3]).  But  even  ignoring  this  issue,  there  are  advantages  in  considering  models 
other  than  the  canonical  model,  with  its  uncountable  state  space.  If  we  are  analyzing 
a  simple  game,  we  are  clearly  far  better  off  conducting  the  analysis  using  a  model  that 
reflects  that  game.  In  any  case,  for  the  results  in  this  paper,  it  is  useful  to  consider  not 
just  the  canonical  model,  but  the  spaces  of  structures  and  frames  introduced  above. 

Aumann’s [1976] theorem tells us that for all $a$ and $b$, the formula $\neg C(pr_1(\varphi) =
a \land pr_2(\varphi) = b)$ is valid in $\mathcal{M}_2^{CP}$ if $a \ne b$: agents cannot agree to disagree in the presence
of a common prior. It is, however, not valid in $\mathcal{M}_2$. In fact, if $M_1$ is a structure
based on the frame $F_1$ of Example 2.2 where $p$ is true at $w_2$ and $w_3$, then
$C(pr_1(p) = 1/2 \land pr_2(p) = 2/3)$ is valid in $M_1$: agent 1 assigns $p$ probability 1/2 in both of
his information sets, and agent 2 assigns it probability 2/3 in both of his. The requirement
that the common prior give each component positive measure is necessary for Aumann’s
result, as the following example shows:6

Example 2.3: Consider the structure $M = (W, \mathcal{K}_1, \mathcal{K}_2, \mathcal{PR}_1, \mathcal{PR}_2, \pi)$ described in
Figure 2, where $W = \{w_1, w_2, w_3\}$ and the partitions induced by $\mathcal{K}_1$ and $\mathcal{K}_2$ are the same;
the equivalence classes are $\{w_1, w_2\}$ and $\{w_3\}$. Agents 1 and 2 disagree about the
probabilities in the first component. According to agent 1, $w_1$ gets probability 2/3 (so $w_2$ gets
probability 1/3); according to agent 2, $w_1$ and $w_2$ get equal probability. Suppose $\pi$ is
such that $p$ is true at $w_1$ and false at the other two worlds. Then it is easy to see that
$(M, w_1) \models C(pr_1(p) = 2/3 \land pr_2(p) = 1/2)$, so we have a disagreement in probability.
However, if we dropped the requirement in CP that $\Pr_W$ must give each component
positive probability, then there would be a common prior in this case: it would assign $w_3$
probability 1 and the other two worlds probability 0. ∎

[Figure 2: A structure with disagreement in probability in one component. Agent 1 assigns probabilities 2/3, 1/3, 1 to $w_1$, $w_2$, $w_3$; agent 2 assigns 1/2, 1/2, 1.]

6I thank Bart Lipman for bringing this issue, and this example, to my attention.
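The role of the positive-probability requirement in Example 2.3 can be checked mechanically. The sketch below encodes the three-world frame and the degenerate prior concentrated on $w_3$; the variable and function names are hypothetical. It uses the fact that both agents have the same partition, so the components coincide with the information sets.

```python
from fractions import Fraction as F

cells = [{"w1", "w2"}, {"w3"}]  # the same partition for both agents
post = {
    1: {"w1": F(2, 3), "w2": F(1, 3), "w3": F(1)},
    2: {"w1": F(1, 2), "w2": F(1, 2), "w3": F(1)},
}
prior = {"w1": F(0), "w2": F(0), "w3": F(1)}  # degenerate prior on w3

def conditioning_ok(i):
    """Agent i's posteriors match the prior conditioned on each cell of positive mass."""
    for cell in cells:
        mass = sum(prior[w] for w in cell)
        if mass > 0 and any(post[i][w] != prior[w] / mass for w in cell):
            return False
    return True

# Prior mass of each component (here the components are exactly the cells).
component_mass = [sum(prior[w] for w in cell) for cell in cells]
```

Both agents pass the conditioning test, because the only information set with positive prior mass is {w3}. The component {w1, w2} has prior mass 0, so this prior is ruled out by CP as stated, which is what blocks the disagreement in that component.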


3  Characterizing  the  CPA 

In  this  section,  I  consider  two  approaches  to  characterizing  the  CPA.  The  first  is  in  the 
spirit  of  the  approaches  taken  in  the  economics  literature  (although  it  has  analogues  in 
the  modal  logic  literature  too),  while  the  second  involves  finding  a  sound  and  complete 
axiomatization.  In  Section  4,  I  discuss  in  more  detail  what  the  definitions  tell  us,  in  light 
of  the  results. 

3.1  Frame  Distinguishability 

Frame  distinguishability  essentially  asks  whether  there  is  a  test  (expressed  as  a  set  of 
formulas)  that  allows  us  to  distinguish  the  frames  satisfying  a  certain  property  from  ones 
that  do  not. 

Definition 3.1: A set $\mathcal{A}$ of formulas distinguishes a collection $\mathcal{F}$ of frames from another
collection $\mathcal{F}'$ if (a) every formula in $\mathcal{A}$ is valid in $\mathcal{F}$ and (b) if $F \in \mathcal{F}'$, then some formula
in $\mathcal{A}$ is not valid in $F$. ∎

Typically the set $\mathcal{A}$ of formulas in Definition 3.1 consists of all instances of some axiom
and the set $\mathcal{F}$ is the set of frames satisfying a certain property (like the CPA). Note
that  this  definition  is  given  in  terms  of  frames,  not  structures;  this  is  necessary  for  the 
technical  results  to  hold. 

My results on frame distinguishability parallel those of Feinberg [2000]: we cannot
distinguish frames that satisfy the CPA from those that do not, but we can distinguish
finite frames satisfying the CPA from those that do not. To do this, we might hope
to use the axiom characterizing Aumann’s “no disagreement” theorem, $\neg C(pr_i(\varphi) =
a \land pr_j(\varphi) = b)$ for $a \ne b$. While this axiom is valid in $\mathcal{F}_n^{CP}$ (and hence $\mathcal{F}_n^{CP,fin}$), it is not
strong enough to distinguish $\mathcal{F}_n^{CP,fin}$ from $\mathcal{F}_n^{fin} - \mathcal{F}_n^{CP,fin}$. As Feinberg [2000] points out,
there are frames in $\mathcal{F}_n^{fin} - \mathcal{F}_n^{CP,fin}$ that satisfy every instance of this axiom, simply because
$C(pr_i(\varphi) = a)$ does not hold for any choice of $a$. For example, if we slightly modify the
probabilities in the frame $F_1$ of Example 2.2 (for example, changing agent 2’s probability
so that the probability of $w_3$ is $2/3 + \epsilon$ for some small $\epsilon$, so that the probability of $w_1$ is
$1/3 - \epsilon$), then the only formulas for which agent 2’s probabilities are common knowledge
are true and false. Thus, $\neg C(pr_1(\varphi) = a \land pr_2(\varphi) = b)$ for $a \ne b$ trivially holds. It follows
that we need something stronger than disagreement in probability to characterize the
CPA.
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Consider  the  following  axiom. 


CP2. If $\varphi_1, \ldots, \varphi_m$ are mutually exclusive formulas (that is, if
$\neg(\varphi_i \land \varphi_j)$ is an instance of a propositional tautology for $i \ne j$), then

$\neg C(a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) > 0 \land a_1 pr_2(\varphi_1) + \cdots + a_m pr_2(\varphi_m) < 0)$.


Notice that CP2 is really an axiom scheme; that is, it represents a set of formulas, obtained
by considering all instantiations of $a_1, \ldots, a_m$ and $\varphi_1, \ldots, \varphi_m$. CP2 is valid
in a structure $M$ if it is not common knowledge that agents 1 and 2 disagree about the
expected value of the random variable which takes value $a_j$ on $[\![\varphi_j]\!]_M$,
$j = 1, \ldots, m$. Intuitively, CP2 says that it cannot be common knowledge that agents 1 and 2
have a disagreement in expectation. It is easy to see that disagreements in expectation
cannot exist if there is a common prior; Feinberg [1995, 2000] and Samet [1998] showed that
the converse also holds in finite spaces.^7 The following theorem just recasts their results
in this framework; its proof shows why we need to use frames rather than structures in
Definition 3.1.
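
The "easy to see" direction can be sketched as a one-line calculation. This is a sketch under stated assumptions, not the paper's proof: $W$ is finite and connected under the two partitions, $X$ is the random variable taking value $a_j$ on $\varphi_j$ (and 0 elsewhere), $\Pi_i$ is agent $i$'s partition, $E_i[X \mid c]$ is shorthand introduced here for the value of $a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m)$ on the cell $c$, and the common prior $\Pr_W$ is assumed to give every cell positive probability.

```latex
\begin{align*}
E_{\Pr_W}[X]
  &= \sum_{c \in \Pi_1} \Pr\nolimits_W(c)\, E_1[X \mid c]
   = \sum_{c \in \Pi_2} \Pr\nolimits_W(c)\, E_2[X \mid c] .
\end{align*}
% If E_1[X | c] > 0 on every cell of agent 1, the first sum is > 0.
% If E_2[X | c] < 0 on every cell of agent 2, the second sum is < 0.
% Both sums equal the same prior expectation, so the two cannot hold together.
```

Under these assumptions, a disagreement in expectation that holds at every world would force the single number $E_{\Pr_W}[X]$ to be both positive and negative.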

Theorem 3.2: CP2 distinguishes $\mathcal{F}_2^{CP,fin}$ from $\mathcal{F}_2^{fin} - \mathcal{F}_2^{CP,fin}$.

Proof:  See  the  appendix.  | 

The proof of Theorem 3.2 shows that if $F$ is a finite frame that does not satisfy
the CPA, there is a single instance $\varphi$ of CP2 which is not valid in $F$. Since this is an
epistemic formula, the agents both know, at a given world in $F$, that $\varphi$ is not valid in $F$.
Intuitively, this means that not only can the modeler distinguish $F$ from structures that
satisfy the CPA, so can the agents.

Does this mean that the agents in a given finite frame $F$ can tell if $F$ satisfies the
CPA? Given infinite time and computational resources, yes. They simply check each of
the (countably many) instances of CP2 to see if they are all valid in $F$. If all of them
are, then $F$ satisfies the CPA; if not, then $F$ does not satisfy the CPA. This approach
is obviously not feasible in practice. There is a better approach: given $F$, as shown by
Samet [1998], a prior is compatible with agent $i$’s posteriors in $F$ iff it is in the convex
hull of the probability measures $\mathcal{PR}_i(w)$, for the worlds $w \in F$. Standard techniques
of computational geometry can be used to compute the convex hull efficiently for each
agent $i$ and to check if the two convex sets thus obtained are disjoint (see, for example,
[Cormen, Leiserson, and Rivest 1990]). $F$ satisfies the CPA iff the convex hulls are not
disjoint.
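
For small frames, a simpler check than intersecting convex hulls is available in a special case. The sketch below is not the convex-hull procedure described above; it swaps in ratio propagation, and it assumes (these are assumptions of the sketch, not of the paper) that the frame is finite, that every posterior probability is strictly positive, and that the frame is connected under the agents' partitions. Under those assumptions any common prior must be positive everywhere, so within each cell the prior ratios must equal the posterior ratios. The function name `common_prior` and the input encoding are illustrative.

```python
from collections import deque
from fractions import Fraction


def common_prior(cells_by_agent):
    """Return a common prior as {world: Fraction}, or None if none exists.

    cells_by_agent: one list per agent; each list holds that agent's cells,
    and each cell is a dict {world: posterior probability} with strictly
    positive entries summing to 1 (an assumption of this sketch).
    """
    cells = [cell for agent_cells in cells_by_agent for cell in agent_cells]
    worlds = sorted({w for cell in cells for w in cell})
    weight = {worlds[0]: Fraction(1)}  # unnormalized prior weight
    queue = deque([worlds[0]])
    while queue:
        w = queue.popleft()
        for cell in cells:
            if w not in cell:
                continue
            for v, p in cell.items():
                # Inside a cell, prior ratios must match posterior ratios.
                implied = weight[w] * Fraction(p) / Fraction(cell[w])
                if v not in weight:
                    weight[v] = implied
                    queue.append(v)
                elif weight[v] != implied:
                    return None  # two chains of cells force different ratios
    if len(weight) != len(worlds):
        raise ValueError("frame is not connected; this sketch does not apply")
    total = sum(weight.values())
    return {w: x / total for w, x in weight.items()}


# Two worlds that neither agent can tell apart, with different posteriors.
disagree = [
    [{"w1": Fraction(1, 2), "w2": Fraction(1, 2)}],
    [{"w1": Fraction(2, 3), "w2": Fraction(1, 3)}],
]
print(common_prior(disagree))
```

Exact `Fraction` arithmetic is used so that the consistency test is an equality check rather than a floating-point tolerance.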

As  Feinberg  and  Samet  show,  we  can  extend  this  characterization  of  the  CPA  in  the 
case  of  two  agents  to  n  >  2  agents.  Consider  the  following  axiom: 

^7 Essentially the same result is proved by Bonanno and Nehring [1999], but they were dealing with
belief rather than knowledge, so rather than being equivalences, their $\mathcal{K}_i$ relations were
serial, Euclidean, and transitive.

CPn. If $\varphi_1, \ldots, \varphi_m$ are mutually exclusive formulas and $a_{ij}$, $i = 1, \ldots, n$,
$j = 1, \ldots, m$, are rational numbers such that $\sum_{i=1}^n a_{ij} = 0$, for $j = 1, \ldots, m$, then

$\neg C(a_{11} pr_1(\varphi_1) + \cdots + a_{1m} pr_1(\varphi_m) > 0 \land \ldots \land a_{n1} pr_n(\varphi_1) + \cdots + a_{nm} pr_n(\varphi_m) > 0)$.


It  is  easy  to  see  that  CP2  is  equivalent  to  the  axiom  that  results  from  CPn  above  when 
n  =  2,  so  I  take  the  liberty  of  abusing  notation  and  denoting  both  as  CP2. 

The  following  result  generalizes  Theorem  3.2;  its  proof  is  omitted,  since  it  follows 
from  results  of  Feinberg  and  Samet  in  the  same  way  as  Theorem  3.2. 

Theorem 3.3: CPn distinguishes $\mathcal{F}_n^{CP,fin}$ from $\mathcal{F}_n^{fin} - \mathcal{F}_n^{CP,fin}$, for all $n \ge 2$.

What happens if the set of worlds is not finite? Feinberg shows by example that we
can find structures for which there is no common prior, and yet there is no disagreement
in expectation (at least, not by bounded random variables). His counterexample can
also be used to show that CP2 does not distinguish $\mathcal{F}_2^{CP}$ from
$\mathcal{F}_2 - \mathcal{F}_2^{CP}$. I give his counterexample here (actually, a simplification of it,
which suffices for my purposes), since it will be needed to prove the next theorem.


Example 3.4: Let $F^* = (W, \mathcal{K}_1, \mathcal{K}_2, \mathcal{PR}_1, \mathcal{PR}_2)$ be the frame described in
Figure 3: $W = \{w_1, w_2, \ldots\}$, $\mathcal{K}_1$ induces the partition
$\{\{w_1\}, \{w_2, w_3\}, \{w_4, w_5\}, \ldots\}$ and $\mathcal{K}_2$ induces the partition
$\{\{w_1, w_2\}, \{w_3, w_4\}, \ldots\}$. $\mathcal{PR}_1$ and $\mathcal{PR}_2$ are as described in the figure.
As the figure shows, both agents think that all the worlds they consider possible at each
world are equally likely (which means that they have probability 1/2 except in the case
of agent 1 at world $w_1$).

Figure 3: A frame that satisfies CP2, but not the CPA.

It is easy to see that there is no common prior in $F^*$. For suppose that $\Pr_W$ is such
a common prior. To get all the conditional probabilities to work out, we must have
$\Pr_W(w_1) = \Pr_W(w_2) = \Pr_W(w_3) = \ldots$, and this is clearly impossible; there is no uniform
distribution on a countably infinite set.^8

^8 There is a common improper prior on $W$, namely, the uniform measure, which assigns measure 1 to
each world (and measure $\infty$ to every infinite set, including $W$). We might hope for a
characterization of the CPA in infinite spaces using common improper priors. However, it is not
hard to show that agents with a common improper (even uniform) prior can disagree about the
expectation of a bounded random variable, so the obvious characterization does not work.

On the other hand, suppose that there exist mutually exclusive formulas $\varphi_1, \ldots, \varphi_m$
such that $C(a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) > 0 \land a_1 pr_2(\varphi_1) + \cdots + a_m pr_2(\varphi_m) < 0)$
is satisfied in some structure $M^*$ based on $F^*$. This means that
$(M^*, w_j) \models a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) > 0 \land a_1 pr_2(\varphi_1) + \cdots + a_m pr_2(\varphi_m) < 0$
for all $j$. Since $\varphi_i$ and $\varphi_j$ are mutually exclusive, we must have that
$[\![\varphi_i]\!]_{M^*} \cap [\![\varphi_j]\!]_{M^*} = \emptyset$ if $i \ne j$. Suppose, without loss of
generality, that $\varphi_1, \ldots, \varphi_m$ are ordered so that
$\min\{k : w_k \in [\![\varphi_i]\!]_{M^*}\} < \min\{k : w_k \in [\![\varphi_j]\!]_{M^*}\}$ if $i < j$.
Note that this means that either $w_1 \notin [\![\varphi_j]\!]_{M^*}$ for all $j$ or
$w_1 \in [\![\varphi_1]\!]_{M^*}$. Since $(M^*, w_1) \models a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) > 0$,
we must have that $w_1 \in [\![\varphi_1]\!]_{M^*}$ and that $a_1 > 0$. Since
$(M^*, w_1) \models a_1 pr_2(\varphi_1) + \cdots + a_m pr_2(\varphi_m) < 0$, we must have
$w_2 \in [\![\varphi_2]\!]_{M^*}$ and $a_2 < -a_1$. An easy induction now shows that
(a) $w_k \in [\![\varphi_k]\!]_{M^*}$, (b) $|a_k| > |a_{k-1}|$ for $k = 2, \ldots, m$, and (c) $a_j$
alternates in sign for $j = 1, \ldots, k$. Now suppose that $m$ is even (a similar argument works
if $m$ is odd). In that case, $\mathcal{K}_1(w_m) = \{w_m, w_{m+1}\}$ and $a_m$ is negative. It follows
that $(M^*, w_m) \models a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) < 0$, contradicting
our original assumption. Thus, every instance of CP2 holds in $M^*$. |
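
The contradiction in this argument can be seen on a concrete instance. The sketch below is illustrative only: it assumes that $\varphi_k$ holds exactly at world $w_k$ for $k = 1, \ldots, m$, picks the alternating coefficients 1, -2, 3, -4 (a hypothetical choice satisfying (b) and (c)), and computes each agent's expectation at a world of the frame of Example 3.4. The helper names `cell` and `expectation` are introduced here.

```python
from fractions import Fraction


def cell(agent, k):
    """Indices of the worlds that the agent considers possible at w_k."""
    if agent == 1:
        if k == 1:
            return (1,)                      # agent 1 knows w_1 exactly
        lo = k if k % 2 == 0 else k - 1      # cells {w_2,w_3}, {w_4,w_5}, ...
        return (lo, lo + 1)
    lo = k if k % 2 == 1 else k - 1          # cells {w_1,w_2}, {w_3,w_4}, ...
    return (lo, lo + 1)


def expectation(agent, k, a):
    """Agent's expectation at w_k of the variable worth a[j-1] on w_j, 0 beyond."""
    def value(j):
        return Fraction(a[j - 1]) if j <= len(a) else Fraction(0)
    worlds = cell(agent, k)
    # Worlds in a cell are equally likely, so the expectation is the average.
    return sum(value(j) for j in worlds) / len(worlds)


a = [1, -2, 3, -4]
for k in range(1, 6):
    print(k, expectation(1, k, a), expectation(2, k, a))
```

With these coefficients agent 1's expectation is positive and agent 2's negative at $w_1$, $w_2$, and $w_3$, but agent 1's expectation turns negative at $w_4$, so the disagreement cannot be common knowledge.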

Example 3.4 shows that CP2 does not distinguish $\mathcal{F}_2^{CP}$ from $\mathcal{F}_2 - \mathcal{F}_2^{CP}$,
since every instance of CP2 is valid in $F^* \in \mathcal{F}_2 - \mathcal{F}_2^{CP}$. We might hope to find a
richer set of formulas that does allow us to distinguish $\mathcal{F}_2^{CP}$ from
$\mathcal{F}_2 - \mathcal{F}_2^{CP}$; the following theorem shows that we cannot.

Theorem 3.5: For all $k \ge 2$, there is no set $\mathcal{A}_k$ of formulas in $\mathcal{L}_k^{K,C,pr}$ that
distinguishes $\mathcal{F}_k^{CP}$ from $\mathcal{F}_k - \mathcal{F}_k^{CP}$.

Proof: See the appendix. |

The key step in the proof of Theorem 3.5 involves showing that every formula that is
valid in $\mathcal{F}_2^{CP}$ is valid in the frame $F^*$ of Example 3.4. Proving this requires a
characterization of the formulas that are valid in $\mathcal{F}_n^{CP}$; that is the subject of the
next section.

3.2  A  Sound  and  Complete  Axiomatization  of  the  CPA 

The more standard approach to characterizing a notion like the CPA in the logic community
is via a sound and complete axiomatization. An axiom system AX consists of a
collection of axioms and inference rules. An axiom is a formula, and an inference rule
has the form “from $\varphi_1, \ldots, \varphi_k$ infer $\psi$,” where $\varphi_1, \ldots, \varphi_k, \psi$ are
formulas. Typically (and, in particular, in this paper), the axioms are all instances of
axiom schemes. Thus, for example, an axiom scheme such as $K_i\varphi \Rightarrow \varphi$ defines an
infinite collection of axioms, one for each choice of $\varphi$. A proof in AX consists of a
sequence of formulas, each of which is either an axiom in AX or follows from earlier formulas
in the sequence by an application of an inference rule. A proof is said to be a proof of the
formula $\varphi$ if the last formula in the proof is $\varphi$. We say $\varphi$ is provable in AX, and
write AX $\vdash \varphi$, if there is a proof of $\varphi$ in AX; similarly, we say that $\varphi$ is
consistent with AX if $\neg\varphi$ is not provable in AX.

An axiom system AX is said to be sound for a language $\mathcal{L}$ with respect to a set
$\mathcal{M}$ of structures if every formula in $\mathcal{L}$ provable in AX is valid with respect to every
structure in $\mathcal{M}$. The system AX is complete for $\mathcal{L}$ with respect to $\mathcal{M}$ if every
formula in $\mathcal{L}$ that is valid with respect to every structure in $\mathcal{M}$ is provable in AX.
We think of AX as characterizing the class $\mathcal{M}$ if it provides a sound and complete
axiomatization of that class. Soundness and completeness provide a connection between the
syntactic notion of provability and the semantic notion of validity.^9

In [Fagin and Halpern 1994], a complete axiomatization is provided for the language
$\mathcal{L}_n^{K,pr}$ with respect to $\mathcal{M}_n$. The axiom system can be modularized into five
components: axioms for propositional reasoning, axioms for reasoning about knowledge, axioms
for reasoning about linear inequalities (since $i$-probability formulas are basically linear
inequalities), axioms for reasoning about probability, and axioms for combined reasoning
about knowledge and probability, forced by assumptions A1 and A2. Let $AX_n^{K,pr}$ consist
of the following axioms and inference rules, where $i \in \{1, \ldots, n\}$:

I. Propositional Reasoning

Prop. All instances of propositional tautologies.

R1. From $\varphi$ and $\varphi \Rightarrow \psi$ infer $\psi$.

II. Reasoning About Knowledge

K1. $(K_i\varphi \land K_i(\varphi \Rightarrow \psi)) \Rightarrow K_i\psi$.

K2. $K_i\varphi \Rightarrow \varphi$.

K3. $K_i\varphi \Rightarrow K_iK_i\varphi$.

K4. $\neg K_i\varphi \Rightarrow K_i\neg K_i\varphi$.

RK. From $\varphi$ infer $K_i\varphi$.

III. Axioms for reasoning about linear inequalities

I1. $(a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b) \Leftrightarrow (a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) + 0\, pr_i(\varphi_{m+1}) \ge b)$.

I2. $(a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b) \Leftrightarrow (a_{j_1} pr_i(\varphi_{j_1}) + \cdots + a_{j_m} pr_i(\varphi_{j_m}) \ge b)$,
if $j_1, \ldots, j_m$ is a permutation of $1, \ldots, m$.

^9 One could similarly define the notion of a sound and complete axiomatization with respect to a set
of frames. Invariably, an axiom system is sound and complete with respect to a set of structures iff it is
sound and complete with respect to the corresponding set of frames, since a formula is valid with respect
to a frame iff it is valid with respect to all the structures based on it. Thus, for simplicity, I focus only
on structures here.

I3. $(a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b) \land (a'_1 pr_i(\varphi_1) + \cdots + a'_m pr_i(\varphi_m) \ge b') \Rightarrow$
$((a_1 + a'_1) pr_i(\varphi_1) + \cdots + (a_m + a'_m) pr_i(\varphi_m) \ge (b + b'))$.

I4. $(a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b) \Leftrightarrow (d a_1 pr_i(\varphi_1) + \cdots + d a_m pr_i(\varphi_m) \ge d b)$ if $d > 0$.

I5. $(a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b) \lor (a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \le b)$.

I6. $(a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b) \Rightarrow (a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) > b')$ if $b > b'$.

IV. Reasoning about probabilities

P1. $pr_i(\varphi) \ge 0$.

P2. $pr_i(true) = 1$.

P3. $pr_i(\varphi \land \psi) + pr_i(\varphi \land \neg\psi) = pr_i(\varphi)$.

RP. From $\varphi \Leftrightarrow \psi$ infer $pr_i(\varphi) = pr_i(\psi)$.^10

V. Reasoning about knowledge and probabilities

KP1. $K_i\varphi \Rightarrow pr_i(\varphi) = 1$.

KP2. $\varphi \Rightarrow K_i\varphi$, if $\varphi$ is an $i$-probability formula or the negation of an $i$-probability formula.

The axioms and rules for propositional reasoning and reasoning about knowledge together
give the standard complete axiomatization for knowledge [Fagin, Halpern, Moses,
and Vardi 1995]. The axioms and rules for reasoning about inequalities and reasoning
about probability are taken from [Fagin, Halpern, and Megiddo 1990], where it is shown
that, together with the propositional component, they give a complete axiomatization
for reasoning about probability. Note that P3 essentially captures finite additivity.
Although our probability measures are countably additive, there is no axiom for countable
additivity. This is essentially because the language is too weak to capture this inherently
infinitary property.

What happens when we add common knowledge to the language? It is well known
[Fagin, Halpern, Moses, and Vardi 1995; Halpern and Moses 1992] that adding the
following to the axioms and rules for knowledge gives a complete axiomatization for the
language of knowledge and common knowledge:^11

VI. Reasoning About Common Knowledge

^10 In [Fagin and Halpern 1994], this inference rule is stated as the axiom $pr_i(\varphi) = pr_i(\psi)$ if
$\varphi \Leftrightarrow \psi$ is a propositional tautology. We need the more general inference rule to prove,
for example, that $pr_i(K_j\varphi) = pr_i(K_jK_j\varphi)$.

^11 In [Fagin, Halpern, Moses, and Vardi 1995; Halpern and Moses 1992] there is also an axiom that says
$E\varphi \Leftrightarrow K_1\varphi \land \ldots \land K_n\varphi$. This axiom is unnecessary here because I have taken
$E\varphi$ to be an abbreviation (whose definition is given by the axiom), rather than taking $E$ to be a
primitive operator.

C1. $C\varphi \Rightarrow E(\varphi \land C\varphi)$.

RC. From $\psi \Rightarrow E(\varphi \land \psi)$ infer $\psi \Rightarrow C\varphi$.

Let $AX_n^{K,C,pr}$ be the system consisting of the axioms and rules of $AX_n^{K,pr}$ together
with C1 and RC.

Theorem 3.6: $AX_n^{K,C,pr}$ is a sound and complete axiomatization for $\mathcal{L}_n^{K,C,pr}$ with
respect to both $\mathcal{M}_n$ and $\mathcal{M}_n^{fin}$ (and hence also with respect to both $\mathcal{F}_n$ and
$\mathcal{F}_n^{fin}$).

Proof: The proof is a straightforward (although lengthy and tedious) combination of
the techniques of [Fagin and Halpern 1994] and [Halpern and Moses 1992]. The result is
actually proved in the course of proving Theorem 3.8. |

It is worth noting that, although common knowledge is, in a sense, an infinitary notion
(that is, $C$ can be defined in terms of an infinite conjunction of formulas involving the
$K_i$’s), it can be characterized using a finitary axiom and inference rule: C1 and RC.
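
On a finite frame, the fixed-point reading behind C1 can be made concrete: the set of worlds where an event is common knowledge is the largest set $X$ satisfying $X = E(S \cap X)$, where $S$ is the event. The sketch below is an illustration under assumptions introduced here (finite set of worlds, each agent's knowledge given as a partition into Python sets, and the function names `everyone_knows` and `common_knowledge`); it is not taken from the paper.

```python
def cell_of(cells, w):
    """The cell of the partition `cells` that contains world w."""
    return next(c for c in cells if w in c)


def everyone_knows(partitions, worlds, event):
    """E(event): worlds where each agent's cell lies inside the event."""
    return frozenset(
        w for w in worlds
        if all(cell_of(cells, w) <= event for cells in partitions)
    )


def common_knowledge(partitions, worlds, event):
    """C(event) as the greatest fixed point of X = E(event & X)."""
    event = frozenset(event)
    current = frozenset(worlds)
    while True:
        shrunk = everyone_knows(partitions, worlds, event & current)
        if shrunk == current:      # no world was removed: fixed point reached
            return current
        current = shrunk


worlds = {1, 2, 3, 4}
partitions = [
    [{1}, {2, 3}, {4}],   # agent 1
    [{1, 2}, {3}, {4}],   # agent 2
]
print(sorted(common_knowledge(partitions, worlds, {1, 2, 3})))
print(sorted(common_knowledge(partitions, worlds, {1, 2})))
```

The iteration terminates because each pass can only remove worlds from a finite set.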

$AX_n^{K,C,pr}$ is not a sound and complete axiomatization for $\mathcal{L}_n^{K,C,pr}$ with respect to
$\mathcal{M}_n^{CP}$ and $\mathcal{M}_n^{CP,fin}$. If we restrict to structures that satisfy the CPA, we get new
valid formulas. Indeed, as we have already seen, every instance of CPn is valid in
$\mathcal{M}_n^{CP}$ (and hence $\mathcal{M}_n^{CP,fin}$). In light of Theorem 3.3, we might hope that if we add
CPn to $AX_n^{K,C,pr}$, this would give us a sound and complete axiomatization, at least for
$\mathcal{M}_n^{CP,fin}$. Unfortunately, this is not the case.

To understand why, some background is helpful. Samet [1998] shows that, given a
frame, the set of possible priors for agent $i$ (i.e., those that can generate the posteriors
defined by $\mathcal{PR}_i$) forms a closed convex set. If two agents do not have a common prior,
the corresponding sets of possible priors must be disjoint. He then makes use of a
standard result of convex analysis [Rockafellar 1972] to conclude that these sets can
be strictly separated by a hyperplane. The separating hyperplane gives the coefficients
$a_1, \ldots, a_m$ in CP2. That is, strict separation by a hyperplane amounts to a disagreement
in expectation.

If we consider the set of priors compatible with a given formula, it is no longer
necessarily a closed set, so Samet’s argument does not quite work. For example, let
$\varphi_1$, $\varphi_2$, and $\varphi_3$ be the three mutually exclusive formulas $p \land q$, $p \land \neg q$,
and $\neg p$, respectively. Let $\psi_1$ be
$(pr_1(\varphi_1) > pr_1(\varphi_2)) \lor ((pr_1(\varphi_1) = pr_1(\varphi_2)) \land (pr_1(\varphi_3) > 1/2))$ and $\psi_2$ be
$(pr_2(\varphi_1) < pr_2(\varphi_2)) \lor ((pr_2(\varphi_1) = pr_2(\varphi_2)) \land (pr_2(\varphi_3) < 1/2))$.^12

Let $X_i$ consist of all prior probability distributions for agent $i$ that satisfy $\psi_i$,
$i = 1, 2$. Then $X_1 = \{(x_1, x_2, x_3) : x_1 > x_2 \text{ or } x_1 = x_2, x_3 > 1/2\}$ (where $x_i$ is the
probability of $\varphi_i$, $i = 1, 2, 3$) and
$X_2 = \{(x_1, x_2, x_3) : x_1 < x_2 \text{ or } x_1 = x_2, x_3 < 1/2\}$. $X_1$ and $X_2$ are
easily seen to be disjoint. Thus, there cannot be a common prior. However, although $X_1$
and $X_2$ are convex, they are not closed; it is easy to show that they cannot be strictly
separated by a hyperplane, and we do not have disagreement in expectation in the spirit
of CP2. As a consequence, we get the following theorem.

^12 This example was suggested by Dov Samet.

Theorem 3.7: The formula $\neg C(\psi_1 \land \psi_2)$ is valid in $\mathcal{M}_2^{CP}$, but is not provable in
the system $AX_2^{K,C,pr}$ + CP2.

It follows from Theorem 3.7 that, if we are to obtain a completeness result, even in
the case of two agents, we need something stronger than CP2. The key insight comes
from examining the sets $X_1$ and $X_2$ in this counterexample again. For all
$(x_1, x_2, x_3) \in X_1$ and $(y_1, y_2, y_3) \in X_2$, we have

$x_1 - x_2 \ge 0 \ge y_1 - y_2$ and $x_1 - x_2 = y_1 - y_2 \Rightarrow (x_3 - x_1 - x_2) > 0 > (y_3 - y_1 - y_2)$.

This example generalizes. More precisely, any two disjoint convex (but not necessarily
closed) sets $X_{00}$ and $X_{10}$ can be separated in expectation in the following more general
sense. Let $\overline{X}_{00}$ and $\overline{X}_{10}$ denote the topological closure of $X_{00}$ and $X_{10}$,
respectively. If $\overline{X}_{00}$ and $\overline{X}_{10}$ are disjoint, then they can be strictly separated by a
hyperplane. If not, then they can be weakly separated by a hyperplane $H_1$. Let
$X_{i1} = X_{i0} \cap H_1$, for $i = 0, 1$. Notice that $X_{01}$ and $X_{11}$ are disjoint, convex sets.
Either $\overline{X}_{01}$ and $\overline{X}_{11}$ are disjoint, so they can be strictly separated by a
hyperplane, or they are weakly separated by a hyperplane $H_2$. We can continue in this way to
construct convex, disjoint sets $X_{ij}$, $i = 0, 1$ for $j = 0, 1, 2, \ldots$. For sufficiently
large $j$, their closures must be disjoint, and hence strictly separable by a hyperplane. This
is made precise in Lemma A.3 in the appendix, and generalized to more than two agents in
Lemma A.4.
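
The two-level separation can be checked numerically on the counterexample sets. The sketch below is illustrative only; it assumes the reading of $X_1$ and $X_2$ with strict inequalities on the third coordinate, uses exact rationals, and introduces the helper names `in_X1`, `in_X2`, `two_level_separated`, and `near_boundary`. It exhibits pairs of points from the two sets that come arbitrarily close together (so the closures meet and no strict separation by a single hyperplane is possible) while the two-level condition still holds.

```python
from fractions import Fraction

HALF = Fraction(1, 2)


def in_X1(x):
    x1, x2, x3 = x
    return x1 > x2 or (x1 == x2 and x3 > HALF)


def in_X2(x):
    x1, x2, x3 = x
    return x1 < x2 or (x1 == x2 and x3 < HALF)


def two_level_separated(x, y):
    """First level: x1 - x2 >= 0 >= y1 - y2; on ties, compare x3 - x1 - x2."""
    first_x, first_y = x[0] - x[1], y[0] - y[1]
    if not (first_x >= 0 >= first_y):
        return False
    if first_x == first_y:                      # both are 0: use second level
        return (x[2] - x[0] - x[1]) > 0 > (y[2] - y[0] - y[1])
    return True


def near_boundary(n):
    """A point of X1 and a point of X2 whose coordinates differ by at most 1/n."""
    t_low = Fraction(1, 4) - Fraction(1, 4 * n)
    t_high = Fraction(1, 4) + Fraction(1, 4 * n)
    x = (t_low, t_low, 1 - 2 * t_low)           # x3 = 1/2 + 1/(2n)
    y = (t_high, t_high, 1 - 2 * t_high)        # y3 = 1/2 - 1/(2n)
    return x, y


for n in (2, 10, 1000):
    x, y = near_boundary(n)
    gap = max(abs(a - b) for a, b in zip(x, y))
    print(n, in_X1(x), in_X2(y), gap, two_level_separated(x, y))
```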

Essentially,  this  observation  tells  us  that  if  the  CPA  holds,  then  two  agents  cannot 
disagree  in  expectation  in  this  more  general  sense.  As  a  consequence,  the  following  axiom 
is  valid. 


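CP2'. If $\varphi_1, \ldots, \varphi_m$ are mutually exclusive formulas and $i^* \in \{1, 2\}$, then

$\neg C( \sum_{j=1}^m a_{1j} pr_1(\varphi_j) \ge 0 \land \sum_{j=1}^m a_{1j} pr_2(\varphi_j) \le 0 \land$
$((\sum_{j=1}^m a_{1j} pr_1(\varphi_j) = 0 \land \sum_{j=1}^m a_{1j} pr_2(\varphi_j) = 0) \Rightarrow$

$\ldots \land$

$(\sum_{j=1}^m a_{(h-1)j} pr_1(\varphi_j) \ge 0 \land \sum_{j=1}^m a_{(h-1)j} pr_2(\varphi_j) \le 0 \land$
$((\sum_{j=1}^m a_{(h-1)j} pr_1(\varphi_j) = 0 \land \sum_{j=1}^m a_{(h-1)j} pr_2(\varphi_j) = 0) \Rightarrow$

$(\sum_{j=1}^m a_{hj} pr_{i^*}(\varphi_j) > 0 \land \sum_{j=1}^m a_{hj} pr_{3-i^*}(\varphi_j) \le 0))) \ldots))$.

It is easy to see that the formula $\neg C(\psi_1 \land \psi_2)$ in Theorem 3.7 follows from CP2'.
Indeed, Theorem 3.8 shows that (in the presence of the other axioms), all formulas valid in
$\mathcal{M}_2^{CP}$ follow from CP2'.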

Just as CP2 generalizes to CPn with $n$ agents, so we get the following generalization
of CP2':

CPn'. If $\varphi_1, \ldots, \varphi_m$ are mutually exclusive formulas, $a_{ikj}$, $i = 1, \ldots, n$,
$j = 1, \ldots, m$, $k = 1, \ldots, h$, are rational numbers such that $\sum_{i=1}^n a_{ikj} = 0$, for
$j = 1, \ldots, m$, $k = 1, \ldots, h$, and $i^* \in \{1, \ldots, n\}$, then

$\neg C( \bigwedge_{i=1}^n (\sum_{j=1}^m a_{i1j} pr_i(\varphi_j) \ge 0) \land (\bigwedge_{i=1}^n (\sum_{j=1}^m a_{i1j} pr_i(\varphi_j) = 0) \Rightarrow$

$\ldots \land$

$(\bigwedge_{i=1}^n (\sum_{j=1}^m a_{i(h-1)j} pr_i(\varphi_j) \ge 0) \land (\bigwedge_{i=1}^n (\sum_{j=1}^m a_{i(h-1)j} pr_i(\varphi_j) = 0) \Rightarrow$
$(\sum_{j=1}^m a_{i^*hj} pr_{i^*}(\varphi_j) > 0) \land \bigwedge_{i \ne i^*} (\sum_{j=1}^m a_{ihj} pr_i(\varphi_j) \ge 0))) \ldots))$.

Although CPn' is not as elegant as we might hope, it does the job. Let $AX_n^{CP}$ consist
of all the axioms and rules of $AX_n^{K,C,pr}$ together with CPn'.

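Theorem 3.8: $AX_n^{CP}$ is a sound and complete axiomatization for $\mathcal{L}_n^{K,C,pr}$ with respect
to both $\mathcal{M}_n^{CP}$ and $\mathcal{M}_n^{CP,fin}$ (and hence also with respect to both $\mathcal{F}_n^{CP}$ and
$\mathcal{F}_n^{CP,fin}$).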

Proof:  See  the  appendix.  | 

The fact that CPn' in addition to the other standard axioms suffices to characterize
the CPA in finite structures may not be so surprising in light of Theorem 3.3. What
may seem somewhat surprising is that there is no difference between infinite structures and
finite structures in Theorem 3.8. The contrast with Theorems 3.3 and 3.5 is striking;
they show that there is a big distinction between finite and infinite frames when we try to
characterize the CPA in terms of frame distinguishability. The key point is that, although
this language is quite expressive in some ways, it is not expressive enough to distinguish
finite structures from infinite ones. This is made precise in the following theorem, which
shows that if a formula is satisfiable at all, it is satisfied in a finite structure. The
result actually follows from the proof of Theorem 3.8, but I provide in the appendix
an alternative proof, using filtration, a standard technique for proving such results in the
modal logic literature. Note that it follows from the result that finite
frames cannot be distinguished from infinite frames (whether or not we assume the CPA)
either using frame distinguishability or complete axiomatizations.

Theorem 3.9: A formula in $\mathcal{L}_n^{K,C,pr}$ is valid with respect to $\mathcal{M}_n^{CP}$ (resp.,
$\mathcal{M}_n$) iff it is valid with respect to $\mathcal{M}_n^{CP,fin}$ (resp., $\mathcal{M}_n^{fin}$).

Proof:  See  the  appendix.  | 


3.3  Restricting the Language to $\mathcal{L}_n^{K,pr}$

What happens if we drop the common knowledge operator from the language? As I
mentioned earlier, it is shown in [Fagin and Halpern 1994] that $AX_n^{K,pr}$ provides a sound
and complete axiomatization for the language $\mathcal{L}_n^{K,pr}$ with respect to $\mathcal{M}_n$. Here, I
show that it is also a complete axiomatization for the language $\mathcal{L}_n^{K,pr}$ with respect to
$\mathcal{M}_n^{CP}$. That is, there are no new consequences in the language $\mathcal{L}_n^{K,pr}$ that follow
from CP. Moreover, restricting to finite structures does not change anything.

Theorem 3.10: $AX_n^{K,pr}$ is a sound and complete axiomatization for $\mathcal{L}_n^{K,pr}$ with respect
to both $\mathcal{M}_n^{CP}$ and $\mathcal{M}_n^{CP,fin}$ (and hence also with respect to both $\mathcal{F}_n^{CP}$ and
$\mathcal{F}_n^{CP,fin}$).

Proof:  See  the  appendix.  | 

We do no better with frame distinguishability. Of course, we already know from
Theorem 3.5 that formulas in $\mathcal{L}_n^{K,C,pr}$ cannot distinguish arbitrary (infinite) frames
satisfying the CPA from ones that do not. But Theorem 3.3 tells us that we can distinguish
finite frames satisfying the CPA from ones that do not, using formulas that involve common
knowledge. It is almost immediate from Theorem 3.10 that this use of common knowledge
is necessary. The real point here is that, since we do not have infinite conjunctions in the
language, common knowledge is not definable in terms of knowledge. Moreover, finite
conjunctions of formulas involving knowledge and probability do not suffice for
characterizing the CPA; infinite conjunctions (particularly, the infinite conjunctions defined
by the $C$ operator) are necessary.

Theorem 3.11: For all $n$, no set $\mathcal{A}$ of formulas in $\mathcal{L}_n^{K,pr}$ distinguishes
$\mathcal{F}_n^{CP,fin}$ from $\mathcal{F}_n^{fin} - \mathcal{F}_n^{CP,fin}$.


Proof:  See  the  appendix.  | 

These results are qualitatively similar to those proved by Lipman [1997], although
there are nontrivial technical differences. Lipman shows that given a structure $M$ not
satisfying (his formalization of) the CPA, a world $w$ in $M$, and $N > 0$, there is a
structure $M^N$ that satisfies the CPA and a world $w^N$ in $M^N$ such that $w$ and $w^N$ agree on
all formulas of depth at most $N$ (where the depth of a formula is the depth of nesting
of the modal operators in the language; thus, for example, $K_1 p$ has depth 1, $K_1 K_2 p$ and
$K_1(pr_1(p) < 1)$ have depth 2, and $pr_1(pr_2(K_1 p) < 1) > 1/2$ has depth 3). On the other
hand, in Lipman’s framework, there are consequences of the CPA even without common
knowledge in the language. In particular, Lipman shows that agents’ beliefs must be
weakly consistent in the sense that it is impossible for agents to have false beliefs. For
example, given his formalization of the CPA, it is impossible for agent 1 to ascribe positive
probability to the event that $p$ is true but agent 2 ascribes probability 0 to it. That is,
$pr_1(p \land pr_2(p) = 0) > 0$ is inconsistent.
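
The depth measure used in Lipman's result is easy to compute once formulas are given a syntax tree. The sketch below is an illustration, not part of the paper: the classes `Atom`, `Not`, `And`, `K`, and `PrCmp` are a hypothetical minimal encoding introduced here, and `PrCmp` covers only comparisons of a single probability term against a constant, which is enough for the four examples in the text.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Atom:
    name: str


@dataclass(frozen=True)
class Not:
    sub: object


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class K:                 # K_agent sub
    agent: int
    sub: object


@dataclass(frozen=True)
class PrCmp:             # pr_agent(sub) <op> bound
    agent: int
    sub: object
    op: str
    bound: Fraction


def depth(f):
    """Nesting depth of the modal operators K_i and pr_i in formula f."""
    if isinstance(f, Atom):
        return 0
    if isinstance(f, Not):
        return depth(f.sub)
    if isinstance(f, And):
        return max(depth(f.left), depth(f.right))
    if isinstance(f, (K, PrCmp)):
        return 1 + depth(f.sub)          # each modal operator adds one level
    raise TypeError(f"unknown formula node: {f!r}")


p = Atom("p")
examples = [
    K(1, p),
    K(1, K(2, p)),
    K(1, PrCmp(1, p, "<", Fraction(1))),
    PrCmp(1, PrCmp(2, K(1, p), "<", Fraction(1)), ">", Fraction(1, 2)),
]
print([depth(f) for f in examples])
```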

Note that the formula $pr_1(p \land pr_2(p) = 0) > 0$ is consistent in $\mathcal{M}_2^{CP}$. Consider the
structure described in Figure 4. There are two worlds, $w_1$ and $w_2$. Agent 2 cannot
distinguish them while agent 1 can (so agent 2’s partition has one equivalence class,
$\{w_1, w_2\}$, while agent 1’s has two, $\{w_1\}$ and $\{w_2\}$). Agent 2 ascribes probability 1 to
$w_2$ and probability 0 to $w_1$. Obviously, agent 1’s probability at $w_1$ and $w_2$ is
determined. If $p$ is true at $w_1$ and false at $w_2$, then clearly
$pr_1(p \land pr_2(p) = 0) = 1$ is true at $w_1$. Moreover, this structure
satisfies CP; we can take the prior to agree with agent 2’s probability measure. Notice
that the common prior assigns $\mathcal{K}_1(w_1)$ probability 0. This is precisely what is disallowed
by Lipman.

Figure 4: A frame satisfying CP but not CPS
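
The two-world structure can be checked mechanically. The sketch below is illustrative and rests on assumptions stated here: worlds are the strings "w1" and "w2", every subset is measurable, and a prior counts as compatible with an agent's posteriors if conditioning on each cell of positive prior probability reproduces that cell's posterior (cells of prior probability 0 are left unconstrained unless `require_positive_cells` is set, which mimics the extra requirement in Lipman's version). The names `pr` and `compatible` are introduced for this example.

```python
from fractions import Fraction

W = ["w1", "w2"]
# Each agent's cells, written as {world: posterior probability}.
cells = {
    1: [{"w1": Fraction(1)}, {"w2": Fraction(1)}],
    2: [{"w1": Fraction(0), "w2": Fraction(1)}],
}
p = {"w1"}  # worlds where the primitive proposition p is true


def pr(agent, w, event):
    """Probability the agent assigns to `event` at world w."""
    posterior = next(c for c in cells[agent] if w in c)
    return sum((q for v, q in posterior.items() if v in event), Fraction(0))


# Worlds satisfying  p and pr_2(p) = 0.
inner = {w for w in W if w in p and pr(2, w, p) == 0}


def compatible(prior, require_positive_cells=False):
    for agent_cells in cells.values():
        for cell in agent_cells:
            mass = sum(prior[v] for v in cell)
            if mass == 0:
                if require_positive_cells:
                    return False
                continue                 # conditioning is undefined here
            if any(prior[v] / mass != q for v, q in cell.items()):
                return False
    return True


agent2_prior = {"w1": Fraction(0), "w2": Fraction(1)}
print(inner, pr(1, "w1", inner))
print(compatible(agent2_prior), compatible(agent2_prior, require_positive_cells=True))
```

Agent 2's measure works as a common prior only because the cell $\{w_1\}$ of agent 1 has prior probability 0; once every cell is required to have positive prior probability, the check fails.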

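Lipman’s (slightly stronger) version of the CPA can be formalized as follows:

CPS. There exists a probability space $(W, \mathcal{X}_W, \Pr_W)$ such that, for all $i$, $w$, if
$\mathcal{PR}_i(w) = (W_{w,i}, \mathcal{X}_{w,i}, \Pr_{w,i})$, then $\mathcal{X}_{w,i} \subseteq \mathcal{X}_W$,
$\Pr_W(\mathcal{K}_i(w)) > 0$, and $\Pr_{w,i}(U) = \Pr_W(U \mid \mathcal{K}_i(w))$ for all $U \in \mathcal{X}_{w,i}$.

Let $\mathcal{F}_n^{CPS}$, $\mathcal{F}_n^{CPS,fin}$, $\mathcal{M}_n^{CPS}$, and $\mathcal{M}_n^{CPS,fin}$ denote the sets of frames
(resp., finite frames, structures, finite structures) for $n$ agents that satisfy CPS.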

Lipman’s results characterizing the consequences of CPS in the language $\mathcal{L}_n^{K,pr}$ can be
viewed, in terms of the framework here, as a combination of results regarding
axiomatizations and frame distinguishability. I briefly review his results here (translated to
this framework).

Lipman first shows that the language $\mathcal{L}_n^{K,pr}$ cannot distinguish structures satisfying
CPS from those satisfying the weaker common support assumption. A structure
$M = (W, \mathcal{K}_1, \ldots, \mathcal{K}_n, \mathcal{PR}_1, \ldots, \mathcal{PR}_n, \pi)$ satisfies the common support assumption
if it satisfies the following condition:

CS. For all worlds $w \in W$, agents $i, j$, and events $E \subseteq \mathcal{K}_i(w) \cap \mathcal{K}_j(w)$, if
$\Pr_{w,i}(E) = 0$ then $\Pr_{w,j}(E) = 0$.

Let $\mathcal{F}_n^{CS}$ (resp., $\mathcal{F}_n^{CS,fin}$) consist of those frames in $\mathcal{F}_n$ (resp.,
$\mathcal{F}_n^{fin}$) that satisfy CS. CS is clearly weaker than CPS (i.e.,
$\mathcal{F}_n^{CPS} \subseteq \mathcal{F}_n^{CS}$). Intuitively, it holds as long as $i$ and $j$’s
priors assign probability 0 to the same events, and does not require that they assign the
same probability to all events. However, Lipman shows that the same formulas in
$\mathcal{L}_n^{K,pr}$ are valid in both sets of structures.
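
On a finite frame the common support condition reduces to a check on single worlds. The sketch below assumes (as assumptions of this illustration, not of the paper) that the frame is finite and that every subset is measurable, so an event has probability 0 exactly when each of its worlds does. The function name `satisfies_cs` and the dictionary encoding of cells are introduced here.

```python
from fractions import Fraction


def satisfies_cs(worlds, cells):
    """Check CS: on worlds both agents consider possible, zero sets must agree.

    cells: {agent: [ {world: posterior probability}, ... ]}, one dict per cell.
    """
    agents = list(cells)
    for w in worlds:
        post = {i: next(c for c in cells[i] if w in c) for i in agents}
        for i in agents:
            for j in agents:
                shared = set(post[i]) & set(post[j])   # K_i(w) intersected with K_j(w)
                # A world that i gives probability 0 must also get 0 from j.
                if any(post[i][v] == 0 and post[j][v] != 0 for v in shared):
                    return False
    return True


# Agent 2 cannot tell the worlds apart and puts all weight on w2.
two_world = {
    1: [{"w1": Fraction(1)}, {"w2": Fraction(1)}],
    2: [{"w1": Fraction(0), "w2": Fraction(1)}],
}
# Same partitions, but agent 2 gives both worlds positive probability.
full_support = {
    1: [{"w1": Fraction(1)}, {"w2": Fraction(1)}],
    2: [{"w1": Fraction(1, 2), "w2": Fraction(1, 2)}],
}
print(satisfies_cs(["w1", "w2"], two_world))
print(satisfies_cs(["w1", "w2"], full_support))
```

The first frame fails the check at $w_1$: agent 2 gives $w_1$ probability 0 while agent 1 gives it probability 1.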

Theorem 3.12: [Lipman 1997] For all $\varphi \in \mathcal{L}_n^{K,pr}$, we have $\mathcal{F}_n^{CS} \models \varphi$ iff
$\mathcal{F}_n^{CPS} \models \varphi$.^13

^13 Actually, Lipman proves this result only for countable frames. Since, by using the techniques of
Theorem 3.9, we can show that a formula is satisfied in $\mathcal{F}_n^{CS}$ (resp., $\mathcal{F}_n^{CPS}$) iff it is
satisfied in $\mathcal{F}_n^{CS,fin}$ (resp., $\mathcal{F}_n^{CPS,fin}$), the result holds for arbitrary frames as well.

Lipman further shows that weak consistency distinguishes frames satisfying CS from
those that do not. More precisely, consider the following axiom:

WC. $pr_i(\varphi \land pr_j(\varphi) = 0) = 0$.

Theorem 3.13: [Lipman 1997] WC distinguishes $\mathcal{F}_n^{CS}$ from $\mathcal{F}_n - \mathcal{F}_n^{CS}$.^14

We might at first think that it follows from Theorems 3.12 and 3.13 that WC
distinguishes frames satisfying CPS from those that do not, but it is easy to see that this is
not true. It is trivial to construct a 2-world frame that satisfies WC but does not satisfy
CPS. In fact, I conjecture that there are no formulas in $\mathcal{L}_n^{K,pr}$ that can distinguish
frames satisfying CPS from ones that do not, although I have not proved this.

What happens when we add common knowledge to the language again? I have not
examined this situation in detail, although I conjecture that analogues to Theorems 3.3,
3.5, and 3.8 hold. Note, however, that we need something stronger than CPn together
with WC to distinguish finite frames satisfying CPS from those that do not, as the
following example shows.

Example 3.14: Consider the structure $M$ described in Figure 5. There are four worlds, $\{w_1, w_2, w_3, w_4\}$. Agent 1's partition is $\{w_1, w_2, w_3\}$ and $\{w_4\}$, while agent 2's is $\{w_1, w_2\}$ and $\{w_3, w_4\}$. $M$ clearly satisfies CS, since both agents agree that $w_3$ gets probability 0. It also satisfies CP, since there is a common prior which gives $w_4$ probability 1. However, it does not satisfy CPS. There can be no common prior that gives $\{w_1, w_2\}$ positive probability. Since $M$ satisfies CS and CP, it satisfies all instances of $CP_2$ (in fact, $CP_2'$) and WC. Thus, these formulas cannot distinguish even finite frames satisfying CPS from ones that do not. |

Figure 5: A frame satisfying CS and CP, but not CPS.


The following strengthening of $CP_2$ is valid in $\mathcal{F}_2^{CPS}$ (and not in the frame of Example 3.14):

$CP_2^s$. If $\varphi_1, \ldots, \varphi_m$ are mutually exclusive formulas, then

$\neg C(a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) \ge 0 \land a_1 pr_2(\varphi_1) + \cdots + a_m pr_2(\varphi_m) \le 0 \land \neg C \neg(a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) > 0))$.

14Lipman actually considers structures rather than frames and imposes an additional condition he calls nonredundancy, which, roughly speaking, says that any two worlds are distinguishable by some formula. By working at the level of frames, we avoid the need for the nonredundancy condition.

Whether this formula (and its obvious generalization to $n$ agents) suffices to distinguish finite frames satisfying CPS from those that do not remains open, as does the problem of providing a sound and complete axiomatization in the language $\mathcal{L}_n^{K,C,pr}$ for frames satisfying CPS.


4  Discussion 

In this paper, I have considered two different ways of characterizing the CPA: by frame distinguishability and by complete axiomatizations. The notion of frame distinguishability is closer to the notions typically used in the economics community. If $\mathcal{F}$ can be distinguished from $\mathcal{F}'$, that amounts to saying that we have a test that can distinguish frames in $\mathcal{F}$ from those in $\mathcal{F}'$. That is analogous to saying that we have a test that
distinguishes  gold  from  bronze.  Clearly,  whether  or  not  we  have  a  distinguishing  test 
depends  on  how  sharp  our  tools  are.  In  this  context,  “sharpness  of  tools”  amounts  to 
the  expressive  power  of  the  language. 

Having  a  test  that  distinguishes  gold  from  bronze  does  not  mean  we  have  a  complete 
characterization  of  the  properties  of  gold.  But  what  is  a  “complete  characterization”  of 
gold?  Does  it  suffice  to  talk  about  its  molecular  structure,  or  do  we  also  have  to  mention 
its  color  and  the  fact  that  it  glitters  in  the  sun?  It  should  be  clear  that  the  notion  of 
“complete  characterization”  is  language  dependent.  We  have  a  complete  characterization 
of gold in a given language $\mathcal{L}$ if we can describe everything that can be said about gold
in $\mathcal{L}$. In general, having a complete characterization in one language tells us nothing
about  getting  a  characterization  in  a  richer  language.  For  example,  if  we  have  a  weak 
language,  it  may  be  easy  to  find  a  complete  characterization,  because  there  are  not  many 
interesting  properties  of  gold  in  that  language.  That  does  not  give  us  any  hint  of  what 
would  constitute  a  complete  characterization  in  a  richer  language.  (By  way  of  contrast,  if 
we  have  a  distinguishing  test  in  one  language,  the  same  test  works  for  any  more  powerful 
language.) 

We observed this phenomenon with the CPA: in the language $\mathcal{L}_n^{K,pr}$, there is nothing
interesting  that  we  can  say  about  the  CPA.  There  are  no  new  axioms  over  and  above  the 
axioms  for  reasoning  about  knowledge  and  probability  in  all  structures  (Theorem  3.10). 
Once  we  add  common  knowledge  to  the  language,  there  are  a  great  many  more  interesting 
things  that  can  be  said  about  (structures  satisfying)  the  CPA. 

For  similar  reasons,  we  may  be  able  to  completely  characterize  a  notion  without 
being  able  to  distinguish  frames  that  satisfy  it  from  ones  that  do  not.  Again,  we  saw  this 
phenomenon with the CPA. We can completely characterize the CPA in the language $\mathcal{L}_n^{K,pr}$ (in a not particularly interesting way, as Theorem 3.10 shows), although $\mathcal{L}_n^{K,pr}$ is of
no  help  in  providing  tests  to  distinguish  frames  satisfying  the  CPA  from  ones  that  do  not 
(Theorem  3.11).  If  we  add  common  knowledge  to  the  language,  then  we  can  distinguish 
finite  frames  satisfying  the  CPA  from  ones  that  do  not  (Theorem  3.3 — this  is  essentially 
the  result  proved  by  Feinberg,  Samet,  and  Bonanno  and  Nehring),  but  cannot  distinguish 
infinite  frames  satisfying  the  CPA  from  those  that  do  not  (Theorem  3.5);  nevertheless, 
we  can  completely  characterize  the  properties  of  (finite  or  infinite)  frames  satisfying  the 
CPA  (Theorem  3.8). 

As  I  observed  in  the  introduction,  the  fact  that  a  language  not  rich  enough  to  provide 
a  distinguishing  test  can  still  completely  characterize  all  the  properties  of  a  notion  of 
interest is a standard phenomenon in logic. This leads to an obvious open question: is there a natural language that is sufficiently rich to distinguish infinite frames satisfying the CPA from ones that do not (given only their posterior information)? Note, however, that such a sufficiently rich language may not be axiomatizable.

In general, the merits of one language relative to another are an issue that needs to be debated. For example, I have considered a language with common knowledge here,
whereas  Feinberg  did  not  consider  a  language  with  common  knowledge.  Is  it  reasonable 
to  add  common  knowledge  to  the  language?  In  general,  there  is  a  tradeoff  between  the 
expressive  power  of  a  language  and  its  complexity.  Enriching  a  language  may  make  it 
easier  to  express  some  notions,  but  in  general  makes  it  harder  to  decide  whether  a  formula 
is  valid.  For  example,  although  there  is  an  algorithm  for  deciding  if  a  formula  is  valid 
whether  or  not  the  language  includes  common  knowledge,  without  common  knowledge 
in  the  language,  the  problem  is  polynomial-space  complete;  with  common  knowledge, 
it  becomes  exponential-time  complete.  (See  [Fagin,  Halpern,  Moses,  and  Vardi  1995, 
Chapter  3]  for  further  discussion  of  these  issues.)  Another  issue  to  be  considered  is  that 
of  axiomatizations.  It  may  be  more  difficult  to  axiomatize  a  richer  language.15  It  is 
typical  in  the  economics  literature  to  define  the  common  knowledge  operator  in  terms 
of  the  knowledge  operator,  leaving  it  out  of  the  language.  The  economics  literature 
is  thus  implicitly  taking  infinite  conjunctions  (actually,  infinite  intersections,  since  in 
economics  there  are  events,  not  formulas)  for  granted.  However,  infinite  conjunctions  are 
not expressible in the language $\mathcal{L}_n^{K,pr}$, which allows only finite conjunctions. Logicians have typically avoided infinitary languages; they typically require infinitary axioms and
rules  of  inference  and  are  difficult  to  deal  with  computationally.  Following  tradition,  I 
have  used  an  explicit  C  operator  rather  than  introducing  infinite  conjunctions.  In  the 
end,  perhaps  the  best  argument  for  including  common  knowledge  here  is  that  the  results 
are  so  much  more  elegant  with  it  than  without  it.  Having  said  that,  it  should  be  clear 
that the decision of what to include in the language is, in general, not one to be taken lightly.

15Although this is not necessarily the case. For example, it is easier to give a complete axiomatization of the logic of probability if linear combinations of probabilities are allowed than if only comparisons of the form $pr(\varphi) \ge a$ are allowed. The axiom P3, which captures the fact that the probability of the union of two disjoint sets is the sum of the individual probabilities, cannot be expressed in a logic that does not allow linear combinations.

To sum up, I have tried to clarify here two distinct notions of "characterization". As I tried to indicate in the introduction, both have their uses. Frame distinguishability is perhaps the more appropriate notion when a frame is given; axiomatizations are more useful to a modeler who is only given some facts about the frame, rather than a complete description of the frame. In any case, it is important to be clear about the differences between the notions.


A  Appendix:  Proofs 

The  order  of  proofs  here  is  different  from  the  order  in  which  the  results  are  stated  in  the 
main  text,  since  some  of  the  earlier  theorems  (particularly  Theorem  3.5)  depend  on  some 
of  the  later  results.  The  statements  are  repeated  for  the  convenience  of  the  reader. 

Theorem 3.2: $CP_2$ distinguishes $\mathcal{F}_2^{CP,fin}$ from $\mathcal{F}_2^{fin} - \mathcal{F}_2^{CP,fin}$.

Proof: It is easy to see that every instance of $CP_2$ is valid in every frame of $\mathcal{F}_2^{CP,fin}$; this is essentially Aumann's [1976] argument. I repeat his proof here to make the paper self-contained, since essentially the same idea is used in a number of other proofs.

Suppose $F \in \mathcal{F}_2^{CP,fin}$, $M = (W, \mathcal{K}_1, \mathcal{K}_2, \mathcal{PR}_1, \mathcal{PR}_2, \pi)$ is a structure based on $F$, $w \in W$, and $\varphi_1, \ldots, \varphi_m$ are mutually exclusive. Suppose by way of contradiction that

$(M, w) \models C(a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) > 0 \land a_1 pr_2(\varphi_1) + \cdots + a_m pr_2(\varphi_m) < 0)$.

Sets of the form $\mathcal{K}_1(w')$ partition $C(w)$. Let $U_1, \ldots, U_k$ be a partition of $C(w)$ into sets of this form. Since $F \in \mathcal{F}_2^{CP,fin}$, there is a common prior $\Pr_W$ on $W$ as required by CP. Since $(M, w) \models C(a_1 pr_1(\varphi_1) + \cdots + a_m pr_1(\varphi_m) > 0)$, it follows that as long as $\Pr_W(U_j) > 0$, we have that $a_1 \Pr_W([\![\varphi_1]\!]_M \cap U_j) + \cdots + a_m \Pr_W([\![\varphi_m]\!]_M \cap U_j) > 0$, for $j = 1, \ldots, k$. (Of course, $a_1 \Pr_W([\![\varphi_1]\!]_M \cap U_j) + \cdots + a_m \Pr_W([\![\varphi_m]\!]_M \cap U_j) = 0$ if $\Pr_W(U_j) = 0$.) Moreover, we have $\Pr_W([\![\varphi_i]\!]_M \cap C(w)) = \sum_{j=1}^{k} \Pr_W([\![\varphi_i]\!]_M \cap U_j)$, for $i = 1, \ldots, m$. Since $\Pr_W(C(w)) > 0$, we must have $\Pr_W(U_j) > 0$ for some $j$, so $a_1 \Pr_W([\![\varphi_1]\!]_M \cap C(w)) + \cdots + a_m \Pr_W([\![\varphi_m]\!]_M \cap C(w)) > 0$. On the other hand, a similar argument using the fact that $(M, w) \models C(a_1 pr_2(\varphi_1) + \cdots + a_m pr_2(\varphi_m) < 0)$ shows that $a_1 \Pr_W([\![\varphi_1]\!]_M \cap C(w)) + \cdots + a_m \Pr_W([\![\varphi_m]\!]_M \cap C(w)) < 0$. This gives us the desired contradiction.
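The summation step in this argument can be checked numerically. The sketch below is illustrative only: the four-world frame, the uniform prior, the two partitions, and the random variable `x` (which plays the role of the coefficients $a_j$ attached to mutually exclusive formulas) are all made-up assumptions. It shows that the prior-weighted sum over $C(w)$ is the same whichever agent's partition is used to split it, so the cell-by-cell sums cannot be positive for one agent and negative for the other.

```python
from fractions import Fraction


def weighted_sum_on(prior, x, cell):
    """Sum of prior(w) * x(w) over a cell, the analogue of
    a_1 Pr_W([[phi_1]] ∩ U_j) + ... + a_m Pr_W([[phi_m]] ∩ U_j)."""
    return sum(prior[w] * x[w] for w in cell)


def posterior_expectation(prior, x, cell):
    """Expectation of x conditional on the cell, or None if the cell has prior mass 0."""
    mass = sum(prior[w] for w in cell)
    if mass == 0:
        return None
    return weighted_sum_on(prior, x, cell) / mass


# Hypothetical example: one component C(w) = {1, 2, 3, 4} with a uniform common prior.
prior = {w: Fraction(1, 4) for w in (1, 2, 3, 4)}
partition_1 = [{1, 2}, {3, 4}]   # agent 1's cells
partition_2 = [{1, 3}, {2, 4}]   # agent 2's cells
x = {1: 3, 2: -1, 3: -1, 4: 3}

total = weighted_sum_on(prior, x, prior.keys())
total_via_1 = sum(weighted_sum_on(prior, x, cell) for cell in partition_1)
total_via_2 = sum(weighted_sum_on(prior, x, cell) for cell in partition_2)
expectations_1 = [posterior_expectation(prior, x, cell) for cell in partition_1]
expectations_2 = [posterior_expectation(prior, x, cell) for cell in partition_2]
```

Here agent 1's posterior expectation is positive on every cell, so the total is positive, and therefore agent 2's expectations cannot all be negative.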

For the converse, suppose that $F = (W, \mathcal{K}_1, \mathcal{K}_2, \mathcal{PR}_1, \mathcal{PR}_2) \in \mathcal{F}_2^{fin} - \mathcal{F}_2^{CP,fin}$. Feinberg and Samet show that there is a random variable $X$ such that for each world $w \in W$, agent 1's expectation of $X$ is positive and agent 2's is negative. That is, if $\Pr_{w,i}$ is agent $i$'s probability distribution at world $w$, for $i = 1, 2$, then we have $\sum_{w' \in \mathcal{K}_1(w)} X(w') \Pr_{w,1}(w') > 0$ and $\sum_{w' \in \mathcal{K}_2(w)} X(w') \Pr_{w,2}(w') < 0$ for each world $w \in W$. Without loss of generality, we can assume that $X(w)$ is rational for each world $w \in W$. Suppose $W = \{w_1, \ldots, w_N\}$; let $K = \lceil \log_2(N) \rceil$. Then we can easily write $N$ mutually exclusive propositional formulas $\varphi_1, \ldots, \varphi_N$ using the primitive propositions $p_1, \ldots, p_K$; these all have the form $q_1 \land \ldots \land q_K$, where each $q_i$ is either $p_i$ or $\neg p_i$. We can then define a structure $M$ based on $F$ with an interpretation $\pi$ such that $[\![\varphi_j]\!]_M = \{w_j\}$, $j = 1, \ldots, N$. Taking $a_j = X(w_j)$, $j = 1, \ldots, N$, then $C(a_1 pr_1(\varphi_1) + \cdots + a_N pr_1(\varphi_N) > 0 \land a_1 pr_2(\varphi_1) + \cdots + a_N pr_2(\varphi_N) < 0)$ is satisfied (in fact, valid) in $M$. |
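The encoding of $N$ worlds by $K = \lceil \log_2 N \rceil$ primitive propositions can be sketched as follows. This is one possible encoding (the binary expansion of the world index), chosen here for illustration; the function names and the literal representation are hypothetical.

```python
import itertools


def world_formulas(n_worlds):
    """Return (K, formulas), where formulas[j] encodes q_1 ∧ ... ∧ q_K for world j.

    Each formula is a tuple of literals (index, polarity): polarity True stands
    for p_index and False for ¬p_index. World j gets the bits of j.
    """
    k = (n_worlds - 1).bit_length()  # equals ceil(log2(n_worlds)) for n_worlds >= 1
    formulas = [tuple((b + 1, bool((j >> b) & 1)) for b in range(k))
                for j in range(n_worlds)]
    return k, formulas


def holds(formula, valuation):
    """Evaluate a conjunction of literals under a truth assignment to p_1..p_K."""
    return all(valuation[p] == polarity for p, polarity in formula)


def mutually_exclusive(k, formulas):
    """Check by brute force that no truth assignment satisfies two formulas."""
    for bits in itertools.product([False, True], repeat=k):
        valuation = {p: bits[p - 1] for p in range(1, k + 1)}
        if sum(holds(f, valuation) for f in formulas) > 1:
            return False
    return True
```

For example, five worlds need three propositions, and the five resulting conjunctions are pairwise inconsistent because any two distinct indices differ in some bit.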

Note that this proof crucially depended on being able to define an interpretation $\pi$ appropriately. This is why frames rather than structures are used in Definition 3.1.

As  I  said  earlier,  I  defer  the  proof  of  Theorem  3.5  until  after  that  of  Theorem  3.8, 
continuing  instead  with  the  proof  of  Theorem  3.7. 

Theorem 3.7: The formula $\neg C(\psi_1 \land \psi_2)$ is valid in $\mathcal{M}_2^{CP}$, but is not provable in the system $AX_2^{K,C,pr} + CP_2$.

Proof: First I show that $\neg C(\psi_1 \land \psi_2)$ is valid in $\mathcal{M}_2^{CP}$. Suppose that $(M, w) \models C(\psi_1 \land \psi_2)$ for some $M \in \mathcal{M}_2^{CP}$. Let $\Pr_W$ be the common prior in $M$, let $W_1$ be the set of worlds in $C(w)$ where $pr_1(\varphi_1) > pr_1(\varphi_2)$ is satisfied, let $W_2$ be the set of worlds in $C(w)$ where $pr_2(\varphi_1) < pr_2(\varphi_2)$ is satisfied, and let $W_3$ be the set of worlds in $C(w)$ where $pr_1(\varphi_1) = pr_1(\varphi_2) \land pr_2(\varphi_1) = pr_2(\varphi_2)$ is satisfied. As in the proof of Theorem 3.2, let $U_1, \ldots, U_m$ be a partition of $C(w)$ into sets of the form $\mathcal{K}_1(w')$. Thus, $\Pr_W([\![\varphi_i]\!]_M \cap C(w)) = \sum_{j=1}^{m} \Pr_W([\![\varphi_i]\!]_M \cap U_j)$. Since $\psi_1$ is common knowledge at $w$, it follows that $\Pr_W([\![\varphi_1]\!]_M \cap U_j) \ge \Pr_W([\![\varphi_2]\!]_M \cap U_j)$ for $j = 1, \ldots, m$. Thus, $\Pr_W([\![\varphi_1]\!]_M \cap C(w)) \ge \Pr_W([\![\varphi_2]\!]_M \cap C(w))$. Moreover, if $\Pr_W(W_1) > 0$, then $\Pr_W([\![\varphi_1]\!]_M \cap C(w)) > \Pr_W([\![\varphi_2]\!]_M \cap C(w))$.

Similarly, since $\psi_2$ is common knowledge at $w$, it follows that $\Pr_W([\![\varphi_1]\!]_M \cap C(w)) \le \Pr_W([\![\varphi_2]\!]_M \cap C(w))$. Moreover, if $\Pr_W(W_2) > 0$, then $\Pr_W([\![\varphi_1]\!]_M \cap C(w)) < \Pr_W([\![\varphi_2]\!]_M \cap C(w))$. Thus, we must have $\Pr_W(W_1) = \Pr_W(W_2) = 0$. It follows that $\Pr_W(W_3) = \Pr_W(C(w)) > 0$. (Recall that CP requires $\Pr_W$ to give every component positive measure.) But it follows from $C\psi_1$ that $\Pr_W(W_3 \cap [\![\varphi_3]\!]_M) > \frac{1}{2} \Pr_W(W_3)$ and from $C\psi_2$ that $\Pr_W(W_3 \cap [\![\varphi_3]\!]_M) < \frac{1}{2} \Pr_W(W_3)$. This gives us the desired contradiction.

I next show that $\neg C(\psi_1 \land \psi_2)$ is not provable in $AX_2^{K,C,pr} + CP_2$. As usual, let $\Delta^m$ denote the $(m-1)$-dimensional simplex, that is, $\{(x_1, \ldots, x_m) \in \mathbb{R}^m : x_1 + \cdots + x_m = 1,\ x_i \ge 0,\ i = 1, \ldots, m\}$. Let $W = \{w_1, w_2, w_3\}$ and, for $(x_1, x_2, x_3) \in \Delta^3$, define the probability measure $\Pr^{(x_1,x_2,x_3)}$ on $W$ by taking $\Pr^{(x_1,x_2,x_3)}(w_i) = x_i$ for $i = 1, 2, 3$. Consider structures of the form $M^{(x_1,x_2,x_3)} = (W, \mathcal{K}_1, \mathcal{K}_2, \mathcal{PR}_1^{(x_1,x_2,x_3)}, \mathcal{PR}_2, \pi)$, where $\mathcal{K}_1$ and $\mathcal{K}_2$ are the universal relations on $W$ (that is, both agents have only one cell in their partition, consisting of all of $W$), $\Pr_{w,1}^{(x_1,x_2,x_3)} = \Pr^{(x_1,x_2,x_3)}$ and $\Pr_{w,2} = \Pr^{(1/4,1/4,1/2)}$ for all $w \in W$, and $\pi$ is such that $p$ is true at $w_1$ and $w_2$ and false at $w_3$, while $q$ is true at $w_1$ and $w_3$ and false at $w_2$. Let $\mathcal{M} = \{M^{\vec{x}} : M^{\vec{x}} \models C(\psi_1 \land \psi_2)\}$. Thus, $\mathcal{M} = \{M^{(x_1,x_2,x_3)} : x_1 > x_2 \text{ or } x_1 = x_2 < 1/4\}$.

Note that $M^{(1/4,1/4,1/2)} \in \mathcal{M}_2^{CP,fin}$. Thus, $M^{(1/4,1/4,1/2)}$ satisfies every instance of $CP_2$. Clearly $M^{(1/4,1/4,1/2)} \notin \mathcal{M}$. However, it is easy to see that for each $\epsilon > 0$, there exists a tuple $\vec{x} \in \Delta^3$ such that $|\vec{x} - (1/4, 1/4, 1/2)| < \epsilon$ and $M^{\vec{x}} \in \mathcal{M}$, where $|\vec{x} - \vec{y}| = \max_{i \in \{1,2,3\}} |x_i - y_i|$ for $\vec{x}, \vec{y} \in \Delta^3$. Thus, $M^{(1/4,1/4,1/2)}$ is in the closure of $\mathcal{M}$, in an appropriate topology.

Claim A.1: For every instance $\sigma$ of $CP_2$, there exists $\epsilon_\sigma > 0$ such that $M^{\vec{x}} \models \sigma$ for all $\vec{x}$ such that $|\vec{x} - (1/4, 1/4, 1/2)| < \epsilon_\sigma$.

I shall prove Claim A.1 shortly; first I show why it suffices to prove the theorem. Suppose there is a proof of $\neg C(\psi_1 \land \psi_2)$ in the system $AX_2^{K,C,pr} + CP_2$. By definition, this means there is a sequence of formulas $\varphi_1, \ldots, \varphi_m$, each of which is an axiom of $AX_2^{K,C,pr} + CP_2$ or follows from previous steps by an application of an inference rule, such that $\varphi_m = \neg C(\psi_1 \land \psi_2)$. Let $\epsilon$ be the minimum of $\epsilon_\sigma$ for all instances $\sigma$ of $CP_2$ that arise in $\varphi_1, \ldots, \varphi_m$. It is easy to see that each formula $\varphi_j$, $j = 1, \ldots, m$, is valid in $M^{\vec{x}}$ if $|\vec{x} - (1/4, 1/4, 1/2)| < \epsilon$: each formula $\varphi_j$ that is an instance of an axiom other than $CP_2$ is valid in every structure; if $\varphi_j$ is an instance of $CP_2$, this follows from Claim A.1 and the choice of $\epsilon$; and if $\varphi_j$ follows from previous formulas by application of an inference rule, this follows since inference rules preserve validity if the formulas they are being applied to are valid. In particular, $\varphi_m = \neg C(\psi_1 \land \psi_2)$ is valid in every structure $M^{\vec{x}}$ such that $|\vec{x} - (1/4, 1/4, 1/2)| < \epsilon$. But this contradicts the fact that $C(\psi_1 \land \psi_2)$ is valid in every structure in $\mathcal{M}$, by choice of $\mathcal{M}$.

Thus, it remains to prove Claim A.1. This claim may seem obvious. Consider any instance $\sigma = \neg C(a_1 pr_1(\sigma_1) + \cdots + a_m pr_1(\sigma_m) > 0 \land a_1 pr_2(\sigma_1) + \cdots + a_m pr_2(\sigma_m) < 0)$ of $CP_2$. Since $\sigma$ is valid in $M^{(1/4,1/4,1/2)}$, it seems clear that making only slight changes to agent 1's probability shouldn't affect the validity of $\sigma$. This intuition is in fact true; however, there is one subtlety involved in proving it: we must show that the event corresponding to $\sigma_j$ does not change as a result of small changes in the probability. This is the content of the next claim, which says that we can partition the set of structures $M^{\vec{x}}$ into convex regions over which the event corresponding to a given formula is constant.

Claim A.2: For all formulas $\varphi \in \mathcal{L}_2^{K,C,pr}$, there is a partition $\Pi_\varphi$ of $\Delta^3$ into a finite number of convex sets (defined by linear inequalities) such that for all $D \in \Pi_\varphi$ and all subformulas $\psi$ of $\varphi$, there is a subset $W_\psi^D$ of $W$ such that $[\![\psi]\!]_{M^{\vec{x}}} = W_\psi^D$ for all $\vec{x} \in D$; that is, for each $D \in \Pi_\varphi$, the set of worlds where $\psi$ is true in $M^{\vec{x}}$ is the same for all $\vec{x} \in D$.

Proof: This result follows easily by induction on the structure of $\varphi$. If $\varphi$ is a primitive proposition, then we can take $\Pi_\varphi = \{\Delta^3\}$; for example, $[\![p]\!]_{M^{\vec{x}}} = \{w_1, w_2\}$ for all $\vec{x} \in \Delta^3$. We can take $\Pi_{\neg\varphi} = \Pi_{K_i\varphi} = \Pi_{C\varphi} = \Pi_\varphi$ and take $\Pi_{\varphi \land \psi} = \{A \cap B : A \in \Pi_\varphi, B \in \Pi_\psi\}$. Finally, suppose $\varphi$ has the form $a_1 pr_i(\sigma_1) + \cdots + a_m pr_i(\sigma_m) \ge b$. If $i = 2$, it is easy to see that we can take $\Pi_\varphi = \Pi_{\sigma_1 \land \ldots \land \sigma_m}$. If $i = 1$, consider a cell $D$ in $\Pi_{\sigma_1 \land \ldots \land \sigma_m}$. Let $W_{\sigma_j}^D = [\![\sigma_j]\!]_{M^{\vec{x}}}$ for $\vec{x} \in D$, $j = 1, \ldots, m$, let $D^{\ge} = \{\vec{x} \in D : a_1 \Pr^{\vec{x}}(W_{\sigma_1}^D) + \cdots + a_m \Pr^{\vec{x}}(W_{\sigma_m}^D) \ge b\}$, and let $\Pi_\varphi = \{D^{\ge}, D - D^{\ge} : D \in \Pi_{\sigma_1 \land \ldots \land \sigma_m}\}$. It is easy to check that this partition has the desired properties. |

We can now prove Claim A.1. Suppose, by way of contradiction, that it does not hold for some instance $\sigma = \neg C(a_1 pr_1(\sigma_1) + \cdots + a_m pr_1(\sigma_m) > 0 \land a_1 pr_2(\sigma_1) + \cdots + a_m pr_2(\sigma_m) < 0)$ of $CP_2$. Since $M^{(1/4,1/4,1/2)}$ satisfies every instance of $CP_2$, it must be the case that $M^{(1/4,1/4,1/2)} \models \sigma$. Since a formula of the form $C\psi$ is true at either all worlds in $W$ or none of them, the set $W_\sigma^D$ must be either $\emptyset$ or $W$ for each $D \in \Pi_\sigma$. Since Claim A.1 is assumed to fail for $\sigma$, there must be some set $D \in \Pi_\sigma$ such that $(1/4, 1/4, 1/2) \in \overline{D}$ (the closure of $D$) and $[\![\sigma]\!]_{M^{\vec{x}}} = \emptyset$ for all $\vec{x} \in D$ (i.e., $W_\sigma^D = \emptyset$). Since $M^{\vec{x}} \models a_1 pr_1(\sigma_1) + \cdots + a_m pr_1(\sigma_m) > 0$ for all $\vec{x} \in D$, we have $a_1 \Pr^{\vec{x}}(W_{\sigma_1}^D) + \cdots + a_m \Pr^{\vec{x}}(W_{\sigma_m}^D) > 0$ for all $\vec{x} \in D$. On the other hand, since $M^{\vec{x}} \models a_1 pr_2(\sigma_1) + \cdots + a_m pr_2(\sigma_m) < 0$ for $\vec{x} \in D$, we have $a_1 \Pr^{(1/4,1/4,1/2)}(W_{\sigma_1}^D) + \cdots + a_m \Pr^{(1/4,1/4,1/2)}(W_{\sigma_m}^D) < 0$. Since $(1/4, 1/4, 1/2) \in \overline{D}$, this gives us the desired contradiction, proving Claim A.1 and the theorem. |

Before proving Theorem 3.8, we need a technical lemma regarding separation of convex sets. It is well known that two disjoint closed convex subsets of $\Delta^m$ can be separated by a hyperplane. (See Rockafellar [1972] for this and all the other standard facts and definitions from convex analysis used below.) As Samet [1998] observes, we can take the separating point to be 0. That is, if $X$ and $Y$ are two disjoint closed convex subsets of $\Delta^m$, there exists a vector $\vec{a} \in \mathbb{R}^m$ such that for all $\vec{x} \in X$ and $\vec{y} \in Y$, we have $\vec{a} \cdot \vec{x} > 0 > \vec{a} \cdot \vec{y}$, where $\cdot$ denotes inner product. The following lemma generalizes this result to the case where $X$ and $Y$ are not necessarily closed. Roughly speaking, it says that either two convex subsets of $\Delta^m$ can be separated by a hyperplane $H_1$, or they can be weakly separated by $H_1$ (where weak separation here means that both sets may intersect $H_1$) and, if we consider the intersection of the two sets with $H_1$, these sets can be separated by a hyperplane $H_2$, or they can be weakly separated by $H_2$ and, if we consider the intersection of the sets with $H_2$, ...; moreover, this process stops after a finite number of steps in such a way that the resulting sets can be (strongly) separated by a hyperplane.

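Lemma A.3: Suppose that $X^1$ and $X^2$ are disjoint, convex (but not necessarily closed) subsets of $\Delta^m$. Then, for some $i^* \in \{1, 2\}$, $h \le m - 1$, and vectors $\vec{a}_1, \ldots, \vec{a}_h$, for all $\vec{y}^1 \in X^{i^*}$ and $\vec{y}^2 \in X^{3-i^*}$, we have

$\vec{a}_1 \cdot \vec{y}^1 \ge 0 \land \vec{a}_1 \cdot \vec{y}^2 \le 0 \land (\vec{a}_1 \cdot \vec{y}^1 = 0 \land \vec{a}_1 \cdot \vec{y}^2 = 0 \Rightarrow$
$\ldots \land$
$(\vec{a}_{h-1} \cdot \vec{y}^1 \ge 0 \land \vec{a}_{h-1} \cdot \vec{y}^2 \le 0 \land (\vec{a}_{h-1} \cdot \vec{y}^1 = 0 \land \vec{a}_{h-1} \cdot \vec{y}^2 = 0 \Rightarrow$
$\vec{a}_h \cdot \vec{y}^1 > 0 \land \vec{a}_h \cdot \vec{y}^2 \le 0)) \ldots)$.    (1)

Moreover, if $X^1$ and $X^2$ are defined by a finite collection of linear equations and inequalities with rational coefficients, the vectors $\vec{a}_1, \ldots, \vec{a}_h$ can all be taken to be rational.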

Proof: The proof proceeds by induction on the maximum dimension of $X^1$ and $X^2$. If it is 1, then both $X^1$ and $X^2$ are lines. It is well known that in this case there exists a vector $\vec{a}$, $i^* \in \{1, 2\}$, and constant $c$ such that $\vec{a} \cdot \vec{y}^1 > c \ge \vec{a} \cdot \vec{y}^2$ for all $\vec{y}^1 \in X^{i^*}$ and $\vec{y}^2 \in X^{3-i^*}$. Moreover, if $X^1$ and $X^2$ are defined by linear equations and inequalities with rational coefficients, $c$ and all the coordinates of $\vec{a}$ can be taken to be rational. Finally, as Samet observes, since $\vec{y}^1, \vec{y}^2 \in \Delta^m$, if we take $\vec{a}'$ to be the result of subtracting $c$ from all the coordinates of $\vec{a}$, we have $\vec{a}' \cdot \vec{y}^1 > 0 \ge \vec{a}' \cdot \vec{y}^2$.
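The shift from threshold $c$ to threshold 0 works because the coordinates of any point of the simplex sum to 1, so subtracting $c$ from every coordinate of $\vec{a}$ lowers the inner product by exactly $c$. A small numeric sketch, with made-up vectors that are not from the paper:

```python
from fractions import Fraction


def dot(a, y):
    """Inner product of two equal-length vectors."""
    return sum(ai * yi for ai, yi in zip(a, y))


def shift(a, c):
    """Subtract the constant c from every coordinate of a."""
    return [ai - c for ai in a]


# Hypothetical separating vector and threshold, and two points of the simplex.
a = [Fraction(2), Fraction(0), Fraction(1)]
c = Fraction(1)
y1 = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]  # a . y1 = 5/4 > c
y2 = [Fraction(0), Fraction(1), Fraction(0)]           # a . y2 = 0 <= c

a_shifted = shift(a, c)
```

With these values, `dot(a_shifted, y)` equals `dot(a, y) - c` for both points, so the separation at threshold `c` becomes a separation at threshold 0.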

Now suppose by induction the result holds for sets of maximum dimension $k$, and suppose that in fact the maximum dimension of $X^1$ and $X^2$ is $k + 1$. Again, by standard results, we know that there exists a vector $\vec{a}_1$ such that $\vec{a}_1 \cdot \vec{x}^1 \ge c \ge \vec{a}_1 \cdot \vec{x}^2$ for all $\vec{x}^1 \in X^1$ and $\vec{x}^2 \in X^2$. As above, we can assume without loss of generality that $c = 0$ and, if $X^1$ and $X^2$ are defined by linear equations and inequalities with rational coefficients, that the coordinates of $\vec{a}_1$ are rational. If at least one of the inequalities above is strict, we are done (replacing $\vec{a}_1$ by $-\vec{a}_1$ if necessary). If not, let $Y^1 = \{\vec{x}^1 \in X^1 : \vec{a}_1 \cdot \vec{x}^1 = 0\}$ and let $Y^2 = \{\vec{x}^2 \in X^2 : \vec{a}_1 \cdot \vec{x}^2 = 0\}$. $Y^1$ and $Y^2$ are disjoint convex sets of dimension at most $k$. Moreover, if $X^1$ and $X^2$ are defined by a finite number of linear equations with rational coefficients, then so are $Y^1$ and $Y^2$. The result now follows from the induction hypothesis. |

The expression in (1) is actually an expression in a formal language for reasoning about linear inequalities introduced in [Fagin, Halpern, and Megiddo 1990]. Since this will come up again later, it is worth making it a little more precise now. Suppose that we start with a fixed infinite set of variables. A basic inequality formula is one of the form $a_1 x_1 + \cdots + a_k x_k \ge b$, where $a_1, \ldots, a_k, b$ are rational numbers and $x_1, \ldots, x_k$ are variables. For example, $2x_1 - x_2 \ge 3$ is a basic inequality formula. An inequality formula is a Boolean combination of basic inequality formulas. An assignment (to variables) is a function $A$ that assigns a real number to every variable. We define

$A \models a_1 x_1 + \cdots + a_k x_k \ge b$ iff $a_1 A(x_1) + \cdots + a_k A(x_k) \ge b$.

We then extend $\models$ to arbitrary inequality formulas, which are just Boolean combinations of basic inequality formulas, in the obvious way, namely

$A \models \neg f$ iff $A \not\models f$
$A \models f \land g$ iff $A \models f$ and $A \models g$.

As usual we say an inequality formula $f$ is valid if $A \models f$ for all $A$ that are assignments to variables. If $f$ is a valid inequality formula and we obtain a formula $\sigma$ in $\mathcal{L}_n^{K,pr}$ by replacing the variables in $f$ by probability terms of the form $pr_i(\psi)$ (replacing each occurrence of a variable $x_j$ by the same probability term), then the resulting formula is clearly also valid in $\mathcal{M}_n$. Moreover, as shown in [Fagin, Halpern, and Megiddo 1990], it is provable using just the axioms I1-I6 for reasoning about linear inequalities and propositional reasoning (Prop and R1). This fact will be used in the proof of Theorem 3.8.
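The satisfaction relation for inequality formulas can be written as a short recursive evaluator. The sketch below is only an illustration: the tuple-based formula representation (tags `"ge"`, `"not"`, `"and"`) and the function name are hypothetical choices, not notation from the paper or from Fagin, Halpern, and Megiddo.

```python
from fractions import Fraction


def satisfies(assignment, formula):
    """Return True iff the assignment A satisfies the inequality formula.

    Formulas are nested tuples:
      ("ge", {var: coeff, ...}, b)  -- basic formula a_1 x_1 + ... + a_k x_k >= b
      ("not", f)                    -- negation
      ("and", f, g)                 -- conjunction
    """
    kind = formula[0]
    if kind == "ge":
        _, coeffs, b = formula
        return sum(c * assignment[x] for x, c in coeffs.items()) >= b
    if kind == "not":
        return not satisfies(assignment, formula[1])
    if kind == "and":
        return satisfies(assignment, formula[1]) and satisfies(assignment, formula[2])
    raise ValueError(f"unknown formula tag: {kind!r}")


# The example from the text: 2*x1 - x2 >= 3.
basic = ("ge", {"x1": 2, "x2": -1}, 3)
A_yes = {"x1": Fraction(2), "x2": Fraction(1)}   # 2*2 - 1 = 3
A_no = {"x1": Fraction(1), "x2": Fraction(0)}    # 2*1 - 0 = 2
```

Other Boolean connectives can be expressed through negation and conjunction in the usual way, so the two clauses above are enough for the whole language.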

Continuing with the main line of our proof, Samet [1998] shows how to generalize the special case of Lemma A.3 where $X^1$ and $X^2$ are closed convex sets to the case of $n$ sets. The following result is the analogous generalization here. I omit the proof, since it proceeds in much the same spirit as Samet's, using the ideas of Lemma A.3.

Lemma A.4: Suppose that $X^1, \ldots, X^n$ are convex (but not necessarily closed) subsets of $\Delta^m$ such that $\bigcap_{i=1}^{n} X^i = \emptyset$. Then, for some $h \le m - 1$, $i^* \in \{1, \ldots, n\}$, and vectors $\vec{a}_{ik}$, $i = 1, \ldots, n$, $k = 1, \ldots, h$, such that $\sum_{i=1}^{n} \vec{a}_{ik} = 0$, for $k = 1, \ldots, h$, for all $\vec{x}^i \in X^i$, $i = 1, \ldots, n$, we have

$\bigwedge_{i=1}^{n} \vec{a}_{i1} \cdot \vec{x}^i \ge 0 \land (\bigwedge_{i=1}^{n} \vec{a}_{i1} \cdot \vec{x}^i = 0 \Rightarrow \ldots$
$(\bigwedge_{i=1}^{n} \vec{a}_{i(h-1)} \cdot \vec{x}^i \ge 0 \land (\bigwedge_{i=1}^{n} \vec{a}_{i(h-1)} \cdot \vec{x}^i = 0 \Rightarrow$    (2)
$(\vec{a}_{i^*h} \cdot \vec{x}^{i^*} > 0 \land \bigwedge_{i \ne i^*} \vec{a}_{ih} \cdot \vec{x}^i \ge 0))) \ldots)$.

Moreover, if the sets $X^i$, $i = 1, \ldots, n$, are each defined by a finite collection of linear equations and inequalities with rational coefficients, the coordinates of $\vec{a}_{ik}$ can all be taken to be rational.

We  are  now  ready  to  prove  Theorem  3.8. 

Theorem 3.8: $AX_n^{CP}$ is a sound and complete axiomatization for $\mathcal{L}_n^{K,C,pr}$ with respect to both $\mathcal{M}_n^{CP}$ and $\mathcal{M}_n^{CP,fin}$ (and hence also both $\mathcal{F}_n^{CP}$ and $\mathcal{F}_n^{CP,fin}$).

Proof: The completeness proof follows closely along the lines of the completeness proof given in [Fagin and Halpern 1994] (which in turn uses a combination of techniques from [Fagin, Halpern, and Megiddo 1990; Halpern and Moses 1992; Makinson 1966]), which shows that $AX_n^{K,pr}$ is a sound and complete axiomatization for $\mathcal{L}_n^{K,pr}$ with respect to $\mathcal{M}_n$. The added complications in this proof are dealing with the fact we have common knowledge in the language and with CP. The techniques for dealing with common knowledge are well known [Fagin, Halpern, Moses, and Vardi 1995; Halpern and Moses 1992], so I focus here on dealing with CP.

We want to show that if $\varphi \in \mathcal{L}_n^{K,C,pr}$ is valid with respect to $\mathcal{M}_n^{CP,fin}$, then it is provable in $AX_n^{CP}$. Equivalently, we must show if $\varphi$ is consistent with $AX_n^{CP}$, then $\varphi$ is satisfied in some structure in $\mathcal{M}_n^{CP,fin}$. The proof actually shows how to construct such a structure.

Let $Sub(\varphi)$ be the set of all subformulas of $\varphi$ and let $Sub^+(\varphi)$ be the set of subformulas of $\varphi$ and their negations.

If $w$ is a finite set of formulas, let $\varphi_w$ be the conjunction of the formulas in $w$. The set $w$ is a maximal consistent subset of $Sub^+(\varphi)$ if $w \subseteq Sub^+(\varphi)$, $\varphi_w$ is consistent with $AX_n^{CP}$, and for every subformula $\psi$ of $\varphi$, either $\psi$ or $\neg\psi$ is in $w$. (Note that $w$ cannot include both $\psi$ and $\neg\psi$, for then $\varphi_w$ would not be consistent.) Following Makinson [1966] (see also [Fagin, Halpern, Moses, and Vardi 1995; Halpern and Moses 1992]), we first construct a Kripke structure for knowledge (but not probability) $(W, \mathcal{K}_1, \ldots, \mathcal{K}_n, \pi)$ as follows: we take $W$, the set of worlds, to consist of all maximal consistent subsets of $Sub^+(\varphi)$. If $u$ and $v$ are worlds, then $(u, v) \in \mathcal{K}_i$ precisely if $u$ and $v$ contain the same formulas of the form $C\psi$, $K_i\psi$, and $a_1 pr_i(\psi_1) + \cdots + a_k pr_i(\psi_k) \ge b$. We define $\pi$ so that for a primitive proposition $p$, we have $\pi(w)(p) =$ true iff $p$ is one of the formulas in the set $w$. Our goal is to define probability assignments $\mathcal{PR}_1, \ldots, \mathcal{PR}_n$ such that $M = (W, \mathcal{K}_1, \ldots, \mathcal{K}_n, \mathcal{PR}_1, \ldots, \mathcal{PR}_n, \pi) \in \mathcal{M}_n^{CP}$ (in fact, it will be in $\mathcal{M}_n^{CP,fin}$, since $W$ is clearly finite) and, moreover, for every world $w \in W$ and every formula $\psi \in Sub^+(\varphi)$, we have

$(M, w) \models \psi$ iff $\psi \in w$.    (*)

Since $\varphi$ is consistent, we must have $\varphi \in w$ for some $w \in W$. Hence, once we show that there exist $\mathcal{PR}_1, \ldots, \mathcal{PR}_n$ such that $M$ satisfies (*), we are done.

It is easy to see that the formulas $\varphi_w$ are mutually exclusive for $w \in W$. Moreover, we can show that $AX_n^{CP} \vdash \psi \Leftrightarrow \bigvee_{\{v \in W \mid \psi \in v\}} \varphi_v$, for all $\psi \in Sub^+(\varphi)$. Using these observations, we can show, using P1-P3 and RP (and propositional reasoning, i.e., Prop and R1) that $AX_n^{CP} \vdash pr_i(\psi) = \sum_{\{v \in W \mid \psi \in v\}} pr_i(\varphi_v)$ (cf. [Fagin, Halpern, and Megiddo 1990, Lemma 2.3]). Using this fact together with I1 and I3, we can show that an $i$-probability formula $\psi \in Sub^+(\varphi)$ is provably equivalent to a formula of the form $\sum_{v \in W} c_v pr_i(\varphi_v) \ge b$, for some appropriate coefficients $c_v$.

For each world $u$ and agent $i$, we associate a set $L_{ui}$ of linear equalities and inequalities
over variables of the form $x_{iv}$, for $v \in \mathcal{K}_i(u)$. We can think of $x_{iv}$ as representing $\Pr_{u,i}(v)$,
i.e., the probability of world $v$ under agent $i$'s probability distribution at world $u$. We
have one inequality in $L_{ui}$ corresponding to every $i$-probability formula $\psi$ in $Sub^+(\varphi)$.
Assume that $\psi$ is equivalent to $\sum_{v \in W} c_v pr_i(\varphi_v) \ge b$. If $\psi \in u$, then the corresponding
inequality is

$$\sum_{v \in \mathcal{K}_i(u)} c_v x_{iv} \ge b.$$

(Note that there are no terms involving $x_{iv}$ for $v \notin \mathcal{K}_i(u)$. Intuitively, this is because
$\Pr_{u,i}$ is a probability measure on $\mathcal{K}_i(u)$, so we can treat $\Pr_{u,i}(v)$ as 0 for $v \notin \mathcal{K}_i(u)$.)
Similarly, if $\neg\psi \in u$, then the corresponding inequality is

$$\sum_{v \in \mathcal{K}_i(u)} c_v x_{iv} < b.$$

Finally, we add to $L_{ui}$ the equality

$$\sum_{v \in \mathcal{K}_i(u)} x_{iv} = 1.$$

Note that if $u' \in \mathcal{K}_i(u)$, then $L_{u'i} = L_{ui}$, since the set of $i$-probability formulas in $u$ and
$u'$ is the same.
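
As an informal illustration of the system $L_{ui}$ (not part of the proof), the following Python sketch builds such a system for one cell and checks whether a candidate assignment satisfies it. The names `build_L` and `satisfies`, the encoding of constraints as triples, and the sample worlds and coefficients are all assumptions made for the example.

```python
from fractions import Fraction

def satisfies(constraints, x):
    """Return True iff the assignment x (dict: world -> Fraction) meets every constraint.

    Each constraint is a triple (coeffs, op, b), where coeffs maps worlds to
    coefficients and op is one of ">=", "<", "==".
    """
    for coeffs, op, b in constraints:
        total = sum(Fraction(c) * x.get(v, Fraction(0)) for v, c in coeffs.items())
        if op == ">=" and not total >= b:
            return False
        if op == "<" and not total < b:
            return False
        if op == "==" and total != b:
            return False
    return True

def build_L(cell, formulas):
    """Build the system for one cell K_i(u).

    formulas is a list of (coeffs, b, holds): holds is True if the i-probability
    formula is in u and False if its negation is in u. Terms for worlds outside
    the cell are dropped, as in the text.
    """
    L = []
    for coeffs, b, holds in formulas:
        restricted = {v: c for v, c in coeffs.items() if v in cell}
        L.append((restricted, ">=" if holds else "<", Fraction(b)))
    # The variables over the cell must sum to 1.
    L.append(({v: 1 for v in cell}, "==", Fraction(1)))
    return L

# Hypothetical data: the cell is {u, v}; world t lies outside it.
cell = {"u", "v"}
formulas = [
    ({"u": 1, "t": 1}, Fraction(1, 2), True),   # this formula is in u
    ({"v": 1}, Fraction(2, 3), False),          # the negation of this one is in u
]
L_ui = build_L(cell, formulas)
good = {"u": Fraction(1, 2), "v": Fraction(1, 2)}
bad = {"u": Fraction(1, 4), "v": Fraction(3, 4)}
print(satisfies(L_ui, good), satisfies(L_ui, bad))
```

Exact rational arithmetic is used because the coefficients in the text are rational and the strict inequality would be fragile under floating point.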

As shown in [Fagin, Halpern, and Megiddo 1990, Theorem 2.2], since $\varphi_u$ is consistent,
there exists a probability measure $\Pr^*_{u,i}$ satisfying (the equation and inequalities in) $L_{ui}$
(taking $x_{iv} = \Pr^*_{u,i}(v)$). If we were not concerned with CP, then we could just define $\mathcal{PR}_i$
so that $\Pr_{u,i} = \Pr^*_{u,i}$. Since $L_{u'i} = L_{ui}$ for $u' \in \mathcal{K}_i(u)$, we could also assume without loss of
generality that $\Pr_{u,i} = \Pr_{u',i}$ for $u' \in \mathcal{K}_i(u)$. The techniques of [Fagin and Halpern 1994]
(and of [Halpern and Moses 1992] in the case of common knowledge) then show that (*)
holds for the resulting Kripke structure. This suffices to prove Theorem 3.6. However, we
must work harder to complete the proof of Theorem 3.8, since the probability assignments
do not necessarily satisfy CP.

Note that we can identify a probability measure on $W$ with an element of $\mathbb{R}^{|W|}$. We can
thus use the tuple $(x_v : v \in W)$ to denote a generic probability measure on $W$. We say
that a probability measure $\Pr = (x_v : v \in W)$ is compatible with $L_{ui}$ if $\Pr(\cdot \mid \mathcal{K}_i(u))$ satisfies
$L_{ui}$ as long as $\Pr(\mathcal{K}_i(u)) \ne 0$. (More precisely, as long as the tuple $(x_{iv} : v \in \mathcal{K}_i(u))$
satisfies $L_{ui}$, where $x_{iv} = x_v / \Pr(\mathcal{K}_i(u))$.) Let $X^i$ consist of all the probability measures
on $W$ compatible with $L_{ui}$ for all $u \in W$. If $\cap_{i=1}^n X^i \ne \emptyset$, then we are done: Choose
$\Pr \in \cap_{i=1}^n X^i$, and define $\mathcal{PR}_i$ so that $\Pr_{u,i} = \Pr(\cdot \mid \mathcal{K}_i(u))$ if $\Pr(\mathcal{K}_i(u)) \ne 0$ and $\Pr_{u,i}$
is some arbitrary probability measure satisfying $L_{ui}$ if $\Pr(\mathcal{K}_i(u)) = 0$. As I mentioned
above, with this choice of $\mathcal{PR}_i$, (*) holds.
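
The step of deriving each $\Pr_{u,i}$ from a single measure on $W$ by conditioning can be sketched as follows. This is only an illustration under assumed data: the function names, the four worlds, and the two partitions are hypothetical, and cells of prior probability 0 are returned as `None` to mark that the text allows an arbitrary measure satisfying $L_{ui}$ there.

```python
from fractions import Fraction

def condition(prior, cell):
    """Return prior(. | cell) as a dict, or None if the cell has prior probability 0."""
    mass = sum(prior.get(v, Fraction(0)) for v in cell)
    if mass == 0:
        return None
    return {v: prior.get(v, Fraction(0)) / mass for v in cell}

def posteriors(prior, partitions):
    """Map each (agent, world) pair to the prior conditioned on that agent's cell.

    partitions maps each agent to a list of cells (sets of worlds).
    """
    out = {}
    for agent, cells in partitions.items():
        for cell in cells:
            post = condition(prior, cell)
            for u in cell:
                out[(agent, u)] = post
    return out

# Hypothetical example with two agents and four worlds.
prior = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4), "d": Fraction(0)}
partitions = {
    1: [{"a", "b"}, {"c", "d"}],
    2: [{"a", "c"}, {"b"}, {"d"}],
}
post = posteriors(prior, partitions)
print(post[(1, "a")])
```

All worlds in the same cell share one conditional measure, which mirrors the observation that $L_{u'i} = L_{ui}$ for $u' \in \mathcal{K}_i(u)$.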

Now suppose, by way of contradiction, that $\cap_{i=1}^n X^i = \emptyset$. Since $X^1, \ldots, X^n$ are defined
by linear equations and inequalities with rational coefficients, by Lemma A.4, there exist
$h < |W|$, $i^* \in \{1, \ldots, n\}$, and vectors $a^i_k$, $i = 1, \ldots, n$, $k = 1, \ldots, h$, satisfying (2) (from
Lemma A.4) such that the coordinates of $a^i_k$ are all rational. Denote by $f^*$ the inequality
formula obtained by using these particular vectors in (2), and taking the vectors $x^i$
to be $(x_{iv} : v \in W)$. Let $L^*_{ui}$ consist of all the equations and inequalities in $L_{ui}$ together
with the equations $x_{iv} = 0$ for all $v \notin \mathcal{K}_i(u)$. Note that if $(x_{iv} : v \in W)$ satisfies $L^*_{ui}$
for all $u \in W$, then it is in $X^i$. Let $\wedge_{i=1}^n L^*_{ui}$ denote the inequality formula that is the
conjunction of the linear inequalities in $L^*_{ui}$, $i = 1, \ldots, n$. By Lemma A.4, $\wedge_{i=1}^n L^*_{ui} \Rightarrow f^*$
is a valid inequality formula, for each $u \in W$.

Let $\sigma_u$ be the formula in $\mathcal{L}_n^{K,C,pr}$ obtained by replacing each occurrence of $x_{iv}$ in $\wedge_{i=1}^n L^*_{ui}$
by $pr_i(\varphi_v)$; similarly, let $\sigma^*$ be the formula obtained by replacing each occurrence of $x_{iv}$
in $f^*$ by $pr_i(\varphi_v)$. As I mentioned earlier, by results of [Fagin, Halpern, and Megiddo
1990], the formula $\sigma_u \Rightarrow \sigma^*$ is provable using I1-I6, Prop, and R1, and hence provable in
$AX_n^{CP}$. Let $\sigma_W$ be $\vee_{u \in W} \sigma_u$. By straightforward propositional reasoning, we have

$AX_n^{CP} \vdash \sigma_W \Rightarrow (\sigma_W \wedge \sigma^*)$.   (3)

As shown in [Halpern and Moses 1992, pp. 344-345], we have

$AX_n^{CP} \vdash \sigma_W \Rightarrow E(\sigma_W)$.   (4)

(In fact, all we need for this proof are the axioms for reasoning about knowledge and common
knowledge; the axioms for probability and inequalities play no role.) Moreover, using
(3), Prop, K1, R1, and R2, it is straightforward to show that

$AX_n^{CP} \vdash E(\sigma_W) \Rightarrow E(\sigma_W \wedge \sigma^*)$.   (5)

From (4), (5), and propositional reasoning, we get that

$AX_n^{CP} \vdash \sigma_W \Rightarrow E(\sigma_W \wedge \sigma^*)$.   (6)

Thus, from (6) and RC, we have that

$AX_n^{CP} \vdash \sigma_W \Rightarrow C(\sigma^*)$.   (7)

Propositional reasoning and (7) tell us that

$AX_n^{CP} \vdash \sigma_w \Rightarrow C(\sigma^*)$,   (8)

for all $w \in W$. But note that $C(\sigma^*)$ is the negation of an instance of $CP'_n$. This says
that $\sigma_w$ is inconsistent, for each $w \in W$. But this contradicts the assumption that $w$ is
a (maximal) consistent set.

This contradiction completes the proof, since it shows that $\cap_{i=1}^n X^i \ne \emptyset$. ∎


Using  Theorem  3.8,  we  can  now  prove  Theorem  3.5. 

Theorem 3.5: For all $k \ge 2$, there is no set $A_k$ of formulas in $\mathcal{L}_k^{K,C,pr}$ that distinguishes
$\mathcal{F}_k^{CP}$ from $\mathcal{F}_k - \mathcal{F}_k^{CP}$.


Proof: First suppose $k = 2$ and, by way of contradiction, that there is some set $A_2$
of formulas that distinguishes $\mathcal{F}_2^{CP}$ from $\mathcal{F}_2 - \mathcal{F}_2^{CP}$. By part (a) of Definition 3.1, $A_2$
must be a subset of the set of formulas valid in $\mathcal{F}_2^{CP}$. Now consider the frame $F^*$ of
Example 3.4. Since $F^* \in \mathcal{F}_2 - \mathcal{F}_2^{CP}$, there must be a formula in $A_2$ that is not valid in
$F^*$. Thus, to get a contradiction, it suffices to show that every formula valid in $\mathcal{F}_2^{CP}$ is
also valid in $F^*$. By Theorem 3.8, it suffices to show that every instance of an axiom of
$AX_2^{CP}$ is valid in $F^*$. By Theorem 3.6, it is immediate that every axiom other than $CP'_2$
is valid in $F^*$. The proof that $CP'_2$ is valid in $F^*$ proceeds along the same lines as the
proof that $CP_2$ is valid in $F^*$, so I omit details here.

Finally, in the case that $k > 2$, define $F^*_k = (W, \mathcal{K}_1, \ldots, \mathcal{K}_k, \mathcal{PR}_1, \ldots, \mathcal{PR}_k)$, where
$W$, $\mathcal{K}_1$, $\mathcal{K}_2$, $\mathcal{PR}_1$, and $\mathcal{PR}_2$ are as in $F^*$, and $\mathcal{K}_2 = \ldots = \mathcal{K}_k$, $\mathcal{PR}_2 = \ldots = \mathcal{PR}_k$. Again,
it is straightforward to show that every instance of $CP'_k$ is valid in $F^*_k$. This suffices for
the proof, just as in the case $k = 2$. ∎


Theorem 3.9: A formula in $\mathcal{L}_n^{K,C,pr}$ is valid with respect to $\mathcal{M}_n^{CP}$ (resp., $\mathcal{M}_n$) iff it is
valid with respect to $\mathcal{M}_n^{CP,fin}$ (resp., $\mathcal{M}_n^{fin}$).

Proof: I start by considering the case of $\mathcal{M}_n^{CP}$ and $\mathcal{M}_n^{CP,fin}$. Clearly if $\varphi$ is valid with
respect to $\mathcal{M}_n^{CP}$, it is also valid with respect to $\mathcal{M}_n^{CP,fin}$. For the converse, it suffices to
show that if $\varphi$ is satisfied in $\mathcal{M}_n^{CP}$ then it is satisfied in $\mathcal{M}_n^{CP,fin}$; that is, if $\varphi$ is satisfiable
at all, it is satisfied in a finite structure. This follows from Theorem 3.8 and its proof.
If $\varphi$ is satisfied in $\mathcal{M}_n^{CP}$ then, by Theorem 3.8, $\varphi$ must be consistent with $AX_n^{CP}$. The
proof of Theorem 3.8 then shows how to construct a structure in $\mathcal{M}_n^{CP,fin}$ satisfying $\varphi$.
(In fact, the structure has at most $2^{|\varphi|}$ worlds, where $|\varphi|$ is the length of $\varphi$, viewed as a
string of symbols, since it is not hard to show by induction on $|\varphi|$ that $|Sub(\varphi)| \le |\varphi|$.)
I provide an alternate proof of this result here, since it gives further insight into what is
going on.

Suppose that $\varphi$ is satisfied in some structure $M = (W, \mathcal{K}_1, \ldots, \mathcal{K}_n, \mathcal{PR}_1, \ldots, \mathcal{PR}_n, \pi) \in
\mathcal{M}_n^{CP}$. Since $M \in \mathcal{M}_n^{CP}$, there is some probability $\Pr_W$ on $W$ as required by CP. Define
an equivalence relation $\sim$ on the worlds in $M$ by taking $w \sim w'$ if $w$ and $w'$ agree on all
formulas in $Sub(\varphi)$. That is, $w \sim w'$ if $(M,w) \models \psi$ iff $(M,w') \models \psi$ for all $\psi \in Sub(\varphi)$. Let $[w]$
be the equivalence class of $w$ according to $\sim$; that is, $[w] = \{w' : w \sim w'\}$. Note that
there are at most $2^{|Sub(\varphi)|}$ ($\le 2^{|\varphi|}$) equivalence classes.
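
The bound on the number of equivalence classes can be seen concretely in the following Python sketch, which groups worlds by the truth values they assign to a finite list of formulas. Here formulas are modeled as Boolean predicates on worlds, a hypothetical stand-in for the relation $(M,w) \models \psi$; the function name `quotient` and the sample data are assumptions.

```python
def quotient(worlds, formulas):
    """Group worlds by the tuple of truth values they give to the formulas.

    worlds: iterable of hashable worlds.
    formulas: list of predicates world -> bool (stand-ins for (M, w) |= psi).
    Returns a dict mapping each truth-value signature to its list of worlds.
    """
    classes = {}
    for w in worlds:
        signature = tuple(bool(f(w)) for f in formulas)
        classes.setdefault(signature, []).append(w)
    return classes

# Hypothetical example: 100 worlds, two "formulas".
worlds = range(100)
formulas = [lambda w: w % 2 == 0, lambda w: w > 50]
classes = quotient(worlds, formulas)
print(len(classes))
```

Since a class is determined by one bit per formula, there can never be more than $2^m$ classes for $m$ formulas, however many worlds the original structure has.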

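Define a structure $M' = (W', \mathcal{K}'_1, \ldots, \mathcal{K}'_n, \mathcal{PR}'_1, \ldots, \mathcal{PR}'_n, \pi')$ as follows:

•  $W' = \{[w] : w \in W\}$,

•  $\mathcal{K}'_i([w]) = \{[w'] : [w]$ and $[w']$ agree on all formulas in $Sub(\varphi)$ of the form $K_i\psi$, $C\psi$,
and $a_1 pr_i(\varphi_1) + \cdots + a_m pr_i(\varphi_m) \ge b\}$,

•  $\mathcal{PR}'_i([w]) = (\mathcal{K}'_i([w]), 2^{\mathcal{K}'_i([w])}, \Pr_{[w],i})$, where $\Pr_{[w],i} = \Pr_W(\cdot \mid \mathcal{K}'_i([w]))$ if $\Pr_W(\mathcal{K}'_i([w])) > 0$,
while if $\Pr_W(\mathcal{K}'_i([w])) = 0$, then $\Pr_{[w],i}$ is a probability measure on $\mathcal{K}'_i([w])$ that
satisfies all the constraints in $L_{wi}$,

•  $\pi'([w])(p) =$ true iff $p \in Sub(\varphi)$ and $\pi(w)(p) =$ true.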

Now a straightforward proof by induction on the structure of $\psi$ shows that $(M', [w]) \models \psi$
iff $(M,w) \models \psi$, for all $w \in W$ and $\psi \in Sub(\varphi)$. The ideas are standard (see, for example,
the completeness proofs in [Halpern and Moses 1992]), so I leave details to the reader.
Thus, if $\varphi$ is satisfied at some world in $M$, say $w_0$, then $(M', [w_0]) \models \varphi$. Moreover, $\Pr_W$
defines a common prior on $W'$. Hence, $M' \in \mathcal{M}_n^{CP,fin}$. This completes the proof.

The argument in the case of $\mathcal{M}_n$ and $\mathcal{M}_n^{fin}$ is almost identical, and is also left to the
reader. ∎

Theorem 3.10: $AX_n^{K,pr}$ is a sound and complete axiomatization for $\mathcal{L}_n^{K,pr}$ with respect
to both $\mathcal{M}_n^{CP}$ and $\mathcal{M}_n^{CP,fin}$ (and hence also both $\mathcal{F}_n^{CP}$ and $\mathcal{F}_n^{CP,fin}$).

Proof: Clearly $AX_n^{K,pr}$ is sound with respect to $\mathcal{M}_n^{CP}$ and $\mathcal{M}_n^{CP,fin}$, since it is already
sound with respect to $\mathcal{M}_n$. We want to show that every formula in $\mathcal{L}_n^{K,pr}$ that is valid in
$\mathcal{M}_n^{CP}$ (resp., $\mathcal{M}_n^{CP,fin}$) is provable in $AX_n^{K,pr}$. As in the proof of Theorem 3.8, it suffices
to show that every formula in $\mathcal{L}_n^{K,pr}$ consistent with $AX_n^{K,pr}$ is satisfied in some structure
in $\mathcal{M}_n^{CP,fin}$. Since $AX_n^{K,pr}$ is complete with respect to $\mathcal{M}_n$, we know that every formula
consistent with $AX_n^{K,pr}$ is satisfied in some structure in $\mathcal{M}_n$. By Theorem 3.9, we can
assume without loss of generality that it is satisfied in a structure in $\mathcal{M}_n^{fin}$. Thus, it suffices
to show that every formula that is satisfied in some structure in $\mathcal{M}_n^{fin}$ is also satisfied in
some structure in $\mathcal{M}_n^{CP,fin}$.

Define the depth of a formula $\varphi$ in $\mathcal{L}_n^{K,pr}$, denoted $d(\varphi)$, as follows:

•  $d(p) = 0$ for a primitive proposition $p$,

•  $d(\neg\psi) = d(\psi)$,

•  $d(\psi \wedge \psi') = \max(d(\psi), d(\psi'))$,

•  $d(K_i\psi) = 1 + d(\psi)$,

•  $d(a_1 pr_i(\psi_1) + \cdots + a_k pr_i(\psi_k) \ge b) = 1 + \max(d(\psi_1), \ldots, d(\psi_k))$.
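
The depth function is a plain structural recursion, which the following Python sketch makes explicit. The encoding of formulas as nested tuples is an assumption chosen for the example and does not come from the paper.

```python
def depth(f):
    """Depth of a formula encoded as a nested tuple.

    Assumed encoding:
      ("prop", name)                        primitive proposition
      ("not", g)                            negation
      ("and", g, h)                         conjunction
      ("K", i, g)                           K_i g
      ("pr", i, [(a_1, g_1), ...], b)       a_1 pr_i(g_1) + ... + a_k pr_i(g_k) >= b
    """
    tag = f[0]
    if tag == "prop":
        return 0
    if tag == "not":
        return depth(f[1])
    if tag == "and":
        return max(depth(f[1]), depth(f[2]))
    if tag == "K":
        return 1 + depth(f[2])
    if tag == "pr":
        return 1 + max(depth(g) for _, g in f[2])
    raise ValueError(f"unknown connective: {tag!r}")

p = ("prop", "p")
# K_1( pr_2(not p) - pr_2(K_1 p) >= 0 )
example = ("K", 1, ("pr", 2, [(1, ("not", p)), (-1, ("K", 1, p))], 0))
print(depth(example))
```

Only knowledge and probability operators add to the depth; Boolean connectives do not, so the depth counts the longest chain of nested modal operators.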


Let a situation be a pair $(M,w)$ consisting of a structure $M$ and a world $w$ in $M$. Two
situations $(M,w)$ and $(M',w')$ are equivalent up to depth $k$, denoted $(M,w) \equiv_k (M',w')$,
if, whenever $\varphi$ is a formula with $d(\varphi) \le k$, then $(M,w) \models \varphi$ iff $(M',w') \models \varphi$.

The  proof  depends  on  two  key  observations,  which  I  state  informally  here  and  then 
make  more  precise. 

1.  If a formula $\varphi_0 \in \mathcal{L}_n^{K,pr}$ is satisfiable at all, it is satisfied at the root of a “treelike”
structure of height at most $d(\varphi_0)$.

2.  Adding worlds to the leaves of this treelike structure that are “distance” greater
than $d(\varphi_0)$ away from the root does not affect the truth of $\varphi_0$ at the root.


To make this precise, I use a standard idea from modal logic of “unwinding” a structure
to a tree. Given a structure $M = (W, \mathcal{K}_1, \ldots, \mathcal{K}_n, \mathcal{PR}_1, \ldots, \mathcal{PR}_n, \pi) \in \mathcal{M}_n^{fin}$, we
define a “treelike” structure $T^*_{M,w,k}$, for each world $w \in W$ and $k \ge 0$, such that
$(T^*_{M,w,k}, r) \equiv_k (M,w)$, where $r$ is the “root” of $T^*_{M,w,k}$, as follows: The first step is to
define (rooted, labeled, directed) trees $T_{M,w,k}$, by induction on $k$. The tree $T_{M,w,0}$ just
consists of a single node $r$, labeled $w$. (In general, nodes will be labeled by worlds in $W$,
but more than one node may be labeled with the same world, and edges will be labeled
by agents.) $T_{M,w,k+1}$ consists of a root node $r$ labeled by $w$ and, for each agent $i$ and
world $w' \ne w$ such that $w' \in \mathcal{K}_i(w)$, a directed edge labeled by $i$ leading from $r$ to the root
$r'$ of $T^i_{M,w',k}$, where $T^i_{M,w',k}$ is the result of removing all the $i$-successors of $r'$ in $T_{M,w',k}$
(and all the nodes reachable from these $i$-successors). We can easily show by induction
on $k$ that this construction guarantees that there is no path in $T_{M,w,k+1}$ that contains
two consecutive $i$-edges, for any agent $i$.
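
The unwinding can be sketched in Python as follows. This is an illustrative reading of the inductive definition, not the paper's own formulation: the representation of a tree as a pair (world label, list of (agent, subtree)), the `skip` parameter that removes the $i$-successors of a subtree's root, and the three-world example are all assumptions.

```python
def unwind(K, w, k, skip=None):
    """Return the tree for world w and depth k as (world, [(agent, subtree), ...]).

    K: dict agent -> dict world -> set of worlds (the cell of that world,
       which contains the world itself).
    skip: agent whose successors are removed at this root (None at the top).
    """
    if k == 0:
        return (w, [])
    children = []
    for i in sorted(K):
        if i == skip:
            continue
        for w2 in sorted(K[i][w]):
            if w2 != w:
                # The subtree reached by an i-edge has no i-successors at its root.
                children.append((i, unwind(K, w2, k - 1, skip=i)))
    return (w, children)

def has_consecutive(tree, incoming=None):
    """True iff some path contains two consecutive edges with the same agent label."""
    _, children = tree
    return any(i == incoming or has_consecutive(sub, i) for i, sub in children)

def height(tree):
    """Number of edges on the longest path from the root."""
    _, children = tree
    return 1 + max(height(sub) for _, sub in children) if children else 0

# Hypothetical structure: agent 1 cannot tell a from b; agent 2 cannot tell b from c.
K = {
    1: {"a": {"a", "b"}, "b": {"a", "b"}, "c": {"c"}},
    2: {"a": {"a"}, "b": {"b", "c"}, "c": {"b", "c"}},
}
tree = unwind(K, "a", 3)
print(tree, height(tree), has_consecutive(tree))
```

Removing the $i$-successors below an $i$-edge is what prevents a path from bouncing back and forth inside a single cell of agent $i$.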


Now let the structure $T^*_{M,w,k} = (W^{M,w,k}, \mathcal{K}_1^{M,w,k}, \ldots, \mathcal{K}_n^{M,w,k}, \mathcal{PR}_1^{M,w,k}, \ldots, \mathcal{PR}_n^{M,w,k}, \pi^{M,w,k})$
be defined as follows:

•  $W^{M,w,k}$ consists of the nodes in $T_{M,w,k}$.

•  $\mathcal{K}_i^{M,w,k}$ is the smallest equivalence relation such that if $n'$ is an $i$-successor of $n$ in
$T_{M,w,k}$, then $(n, n') \in \mathcal{K}_i^{M,w,k}$.

•  $\mathcal{PR}_i^{M,w,k}(n) = (\mathcal{K}_i^{M,w,k}(n), 2^{\mathcal{K}_i^{M,w,k}(n)}, \Pr_{n,i})$, where if $n$ is not a leaf in $T_{M,w,k}$ or if
$n$ is a leaf in $T_{M,w,k}$ and is the $i$-successor of some (non-leaf) node, then for all
$n' \in \mathcal{K}_i^{M,w,k}(n)$, we have $\Pr_{n,i}(n') = \Pr_{f(n),i}(f(n'))$, where $f$ is the function that
associates with each node in $T_{M,w,k}$ the world that labels it; if $n$ is a leaf in $T_{M,w,k}$
and is not the $i$-successor of some non-leaf node, then $\Pr_{n,i}(n) = 1$. (In this case,
it is easy to see that $\mathcal{K}_i^{M,w,k}(n) = \{n\}$.)

•  $\pi^{M,w,k}(n)(p) = \pi(f(n))(p)$.

It is easy to check that $\Pr_{n,i}$ is indeed a probability measure on $\mathcal{K}_i^{M,w,k}(n)$ for each agent
$i$ and $n \in W^{M,w,k}$.


Lemma A.5: $(M,w) \equiv_k (T^*_{M,w,k}, r)$, where $r$ is the root of $T_{M,w,k}$.


Proof: For each node $n \in W^{M,w,k}$, let $dist(r,n)$ be the distance from $r$ to $n$ in $T_{M,w,k}$.
A straightforward induction on $d(\psi)$, which I leave to the reader, can be used to show
that if $d(\psi) + dist(r,n) \le k$, then $(T^*_{M,w,k}, n) \models \psi$ iff $(M, f(n)) \models \psi$. Since $dist(r,r) = 0$
and $f(r) = w$, this gives us the desired result. ∎

Lemma A.5 actually proves both of the informal observations above. It shows
that if a formula $\varphi_0$ is satisfiable at all, it is satisfied in a treelike structure of height at
most $d(\varphi_0)$, since if $\varphi_0$ is satisfied at the situation $(M,w)$, then $T^*_{M,w,k}$ (with $k = d(\varphi_0)$)
is the required treelike structure. Moreover, it shows that making changes in this treelike structure
by adding worlds to leaves does not affect the truth of $\varphi_0$, since if $M'$ is the resulting
structure, we will still have $T^*_{M,w,k} = T^*_{M',w,k}$. The remainder of the argument uses this
second point (and makes it more precise).

Let $n$ be a leaf of $T_{M,w,k}$ and suppose that $n$ is the $i$-successor of some node $n'$. We
construct a structure $M^*$ that is almost identical to $T^*_{M,w,k}$. Informally, we add a new
world $n^*$ which is the $i'$-successor of $n$ for some $i' \ne i$, and assume that all agents assign
$n^*$ probability 1. More precisely, let $M^* = (W^*, \mathcal{K}^*_1, \ldots, \mathcal{K}^*_n, \mathcal{PR}^*_1, \ldots, \mathcal{PR}^*_n, \pi^*)$, where

•  $W^* = W^{M,w,k} \cup \{n^*\}$,

•  $\mathcal{K}^*_j = \mathcal{K}_j^{M,w,k} \cup \{(n^*, n^*)\}$ for $j \ne i'$; $\mathcal{K}^*_{i'} = \mathcal{K}_{i'}^{M,w,k} \cup \{(n, n^*), (n^*, n), (n^*, n^*)\}$,

•  $\mathcal{PR}^*_j(n') = (\mathcal{K}^*_j(n'), 2^{\mathcal{K}^*_j(n')}, \Pr^*_{n',j})$, where $\Pr^*_{n',j} = \Pr_{n',j}$ if $n' \ne n^*$ and $(n',j) \ne
(n,i')$, and $\Pr^*_{n',j}$ is the unique probability measure such that $\Pr^*_{n',j}(n^*) = 1$ if
$n' = n^*$ or $(n',j) = (n,i')$,

•  $\pi^*(n') = \pi^{M,w,k}(n')$ if $n' \ne n^*$ (the definition of $\pi^*(n^*)$ is irrelevant).


Our construction guarantees that (a) $T^*_{M,w,k} = T^*_{M^*,w,k}$ (since the way we changed
$T^*_{M,w,k}$ to get $M^*$ involved only the addition of a node $k + 1$ away from the root) and
(b) $M^* \in \mathcal{M}_n^{CP,fin}$. To see (b), note that there is a common prior that gives probability
1 to $n^*$.

We can now easily complete the proof of Theorem 3.10. Suppose $\varphi_0$ is a formula satisfied
in some situation $(M_0, w_0)$, where $M_0 \in \mathcal{M}_n^{fin}$ and $d(\varphi_0) = k$. Using the construction
above, we get a structure $M^* \in \mathcal{M}_n^{CP,fin}$ such that $T^*_{M^*,w_0,k} = T^*_{M_0,w_0,k}$. Thus, if $r$ is the
root of $M^*$, we have $(M^*, r) \models \varphi_0$. ∎

Theorem 3.11: For all $n$, no set $A$ of formulas in $\mathcal{L}_n^{K,pr}$ distinguishes $\mathcal{F}_n^{CP,fin}$ from
$\mathcal{F}_n^{fin} - \mathcal{F}_n^{CP,fin}$.


Proof: Suppose $A$ distinguishes $\mathcal{F}_n^{CP,fin}$ from $\mathcal{F}_n^{fin} - \mathcal{F}_n^{CP,fin}$. Let $F \in \mathcal{F}_n^{fin} - \mathcal{F}_n^{CP,fin}$. By
part (b) of Definition 3.1, there must be some formula $\varphi \in A$ that is not valid in $F$. But
by part (a) of Definition 3.1, $A$ must be a subset of the set of formulas valid in $\mathcal{F}_n^{CP,fin}$.
By Theorem 3.10, it follows that the formulas in $A$ are also valid in $\mathcal{F}_n^{fin}$. Thus, $\varphi$ must
also be valid in $F$, giving us a contradiction. ∎